Limit Values and Factors Influencing Limit Values for Spruce

(1)

Limit Values and Factors

Influencing Limit Values for Spruce

Author:

Zhang Liming

Supervisors:

Lennart Norell

and

Johan Lyhagen

(2)

Limit Values and Factors Influencing Limit Values

for Spruce

Zhang Liming1

1 Department of Statistics, Uppsala University, Uppsala, Sweden.

Abstract

We collected the data for decomposition of spruce litter to determine the limit values of mass loss and to find both chemical and climate factors that influence limit values. Our data contained 28 sequences of spruce which mainly in Sweden and a small part in other places. We choose mean annual temperature (MAT) and mean annual precipitation (MAP) as climate factors and water solubles, lignin, N, P, K, Ca, Mg and Mn as chemical factors. Then we got the estimated limit values by performing a nonlinear model with mass loss and time spots, and found out the influential factors by using another linear mixed model. At the end we knew that linear mixed model is a proper and efficient approach for determining the factors, P and MAP are the significant factors and Species is a good random effect to explain the variance within groups.

Key Words: Limit values, Litter decomposition, Nonlinear model, Linear mixed model.

(3)

INTRODUCTION

The study of late decomposition of plant litter is much less than the early ones. And it deserve more attention in order to find out factors regulate late-stage decompo-sition rates. Some studies have shown that the rates decline as decompodecompo-sition progress going (FOGEL and CROMACK (1977)) and even may approach zero (BERG and EK

-BOHM(1991)). In such case, the accumulated mass loss approaches a maximum value for decomposition, it also called a “limit value”. Figure1is a sample of this value. By modeled as a asymptote of a mathematical function, the estimated limit value may not show a clear stop of decomposition but could give a stage with a very low decomposi-tion rate. Such a limit value deserves more exploradecomposi-tion in order to find out the factors which influence it.

Although limit values don’t mean the decomposition progress completely stopped, but the rest organic matter decomposes very slowly if limit values were reached. Con-sidering limit values with climate, temperature, moisture, litter components and other influential factors, we can get an important insight of the decomposition progress.

The reasons why limit value exists are that some components of litter hard to decompose while others seem to decompose completely are not so clear, but may be

(4)

related to the nutritional requirements or constraints of the decomposing microbial community. For example, N (Nitrogen) has been reported to has effect on the remains of degrading lignin to form recalcitrant condensation products. Such product may be very hard to degrade due to some chemical reason (N ¨OMMIKand VAHTRAS(1982)).

PICCOLOet al. (1999) found that some parts of such bonds create a hydrophobic

sur-face thereby resisting decomposition. The higher amount of lignin and N in the litter the more likely the such bonds will be formed.BERGet al.(2000) used 106 sets of foliar litter comprising 21 tree species representing a wide range in chemical composition and found a highly significant negative relationship between limit value and initial N amount in litter.

The initial amount of Mn in litter has positive effect on decomposition rate in lit-ter of Norway spruce (BERGet al.(2000)). More, positive relationships between annual mass loss and Mn amount was found in pine needle litter (BERGet al.(2007)).

Consid-ering the relationship between Mn and decomposition rate, a positive relationship has been found between Mn amount and limit value (BERGet al. (1996)) which implying

a litter with higher initial Mn amount has a higher accumulated mass loss.

The aim of this paperis to find factors that influence limit value for spruce litter, such as climate, temperature and litter components etc. And we also try to find the difference in these influence factors between different species or different sites. In this paper we will construct some different models, such as the model only contain the chemical components, the model contain chemical components and climate factors (temperature and precipitation), the model specified by sites, the model specified by species etc. For different model, there may be different factor plays it’s role. In Section 2 we will explain the statistical methods used in this paper while in Sections 3 and 4 we will analyze the data and give a detailed explanation to the results. At the end, a conclusion ends the paper.

(5)

METHODS

BERG et al.(2010) performed a nonlinear model to relate mass loss to time spots, here we also do the same attempt. And then we will use a linear mixed model to find the influential factors.

Nonlinear Model:

A linear model is usually presented as

Y= β0+β1Z1+β2Z2+. . .+β_kZ_k+ε (1)

where the Zican present any functions of the basic predictor variables, the linear is the

definition of parameters but not the predictors. Any model that is not of the form given by Equation (1) will be called a nonlinear model, which is, nonlinear in the parameters. For example, in this paper, we have a nonlinear model in the form of

yi =θ1(1−e −θ2 xi

θ1 _{) +}_ε_i ₍₂₎

where yi is observed mass loss values at time xi, εi is the error item. After a simple

derivation we can see that

lim xi→∞ E(yi) =θ1 E(yi) ≈θ1(1− (1− θ2xi θ1 )) =θ2xi (when xigoing to 0)

So the parameters θ1 and θ2 can be seen as the estimated limit value of mass loss

and the estimated initial decomposition rate. The model is meaningful if θ2 > 0 and

0<θ1< 1. The estimations of parameters can be got by nlm function inR DEVELOP

-MENTCORETEAM(2011).

The standard form of nonlinear model can be presented as follows, it is not the same as linear cases. In order to describe more clearly, we changed the parameters βi

to θi.

(6)

If we write

X= (x1, x2, . . . , xk)0

θ= (θ1, θ2, . . . , θp)0

We can give a short form of Equation (3) as

Y = f(X, θ) +ε (4)

The assumption of normality and independence of the errors is the same as linear case,

ε ∼ N(0, Iσ2), 0 is a vector of zeros and I is a n-dimension diagonal matrix with 1 in

the diagonal line. n equal to the number of observations. Then the sum square of errors for this nonlinear model and given data is

S(θ) =

n

∑

u=1

[Yu− f(xu, θ)]2

Here the sum of square is a function of θ since Yuand xuare fixed observations. What

we should do is to find a ˆθ as an estimate of θ whose value should minimize S(θ).

Now, we give an initial value θ0 = (θ10, θ20)to θ. We expand f(x, θ)at θ0 into a

Taylor series f(x, θ) = f(x, θ0) + 2

∑

i=1 ∂ f(x, θ) ∂θi θ=θ0 (θ_i−θ_i0) If we set f_u0= f(xu, θ0) =θ10(1−e− θ20 x θ10 ) β0_i =θ_i−θ_i0 z0_iu = ∂ f(xu, θ) ∂θi θ=θ0 (In this paper, z0_1u=1− (θ20xu θ10 +1)e−θ20 xθ10 _{, z}0 2u= xue− θ20 x θ10 )

We can present Equation (4) as

yu− fu0 = 2

∑

i=1

(7)

It’s in the form of a linear model. We can estimate the parameters β0_i. If we write Z0=         z0₁₁ z0₂₁ z0₁₂ z0₂₂ .. . ... z0_1n z0_2n         , B0=   b0 1 b0₂  , Y₀ =         y1− f01 y2− f₀1 .. . yn− fn1         Then the estimate of β= (β0₁, β0₂)is given by the mixed model equation

ˆ

B0= (Z00Z0)−1Z00Y0

The vector B0will minimize the sum of squares

S(θ) = n

∑

u=1 [yu− f(xu, θ0) − 2

∑

i=1 β0_iz0_i0]2

If we denote θ1 = (θ11, θ21)where θi1 = b_i0+θi0( i = 1, 2), it can be thought as a

revised best estimated of θ. And as a sequence, θj+1 = θj+Bj = θj+ (Z0jZj)−1Z0jYj.

Go on this iteration till the estimate of parameters θj satisfied |θi(j+1)−θij

θij

| <δ, i=1, 2

where δ is an enough small quantity. (BATESand WATTS(1988) andDRAPER(1998))

Linear Mixed Model:

The estimated limit values were given by Equation (2). Then, the independent vari-ables (lignin, K, etc.) can be related to the estimated limit values by using a linear mixed model. A general linear model is formed as

yi = β0+β1x1i+β2x2i+. . .+β_kx_ki+εi (5)

or

Y=Xβ+ε

and the normality assumption is ε∼ N(0, Iσ2₎_{. However, in reality, we often observe}

(8)

family. It is very reasonable to suspect that the individuals within the same cluster are correlated. As a consequence, the parameters estimated become inefficient and the standard error are wrong thus any inference based on them are incorrect. A solution to this problem is, assume that there p clusters in the data and the Equation (5) can be modified as

yji = β0+β1x1ji+β2x2ji+. . .+βkxkji+uj+εji

i=1, 2, . . . , nj; j=1, 2, . . . , p

or

Y= Xβ+Zu+ε

where Y is a vector of observations, with mean E(Y) =Xβ; β is a vector of coefficients of design matrix X; u is a vector of IID (independent identically distributed) random effects with mean E(u) = 0 and covariance matrix Var(u) = G; ε is a vector of IID random errors with mean E(ε) = 0 and variance Var(e) = R; X and Z are the design

matrices. Then, the estimation of β and u is given by   X0R−1_X _X0_R−1_Z Z0R−1X Z0R−1Z+G−1     ˆ β ˆ u  =   X0R−1_Y Z0R−1Y   SeeDEMIDENKO(2004). DATA DESCRIPTION

The data we use is collected from different places, 14 sites are located in Sweden, include 24 Norway Spruces (Nspr) and 1 GrNspr in total. The other three samples are from Cesareni, a place in Italy. There are two parts in the data. One is the climate part and the other is chemical part. In climate part, we need mean annual temperature (MAT, unit: ◦C) and mean annual precipitation (MAP, unit: mm). In chemical part, we record initial litter concentrations of water soluble cell contents (WatSol), lignin, nitrogen (N), phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg) and manganese (Mn). Units of chemical components are all mg/g. The data was shown in Figure2, which is

(9)

Table 1: Summary of the estimates. ˆm is the estimate of limit values (m∗) and ˆk is the estimate of initial decomposition rates (k∗).

Index mˆ ˆk Index mˆ ˆk Index mˆ ˆk Index mˆ ˆk 1 1.00 0.22 8 0.72 0.43 15 0.61 0.25 22 0.51 0.36 2 0.84 0.36 9 0.85 0.41 16 0.68 0.51 23 0.56 0.30 3 0.85 0.34 10 0.81 0.33 17 0.53 0.28 24 0.55 0.37 4 0.97 0.22 11 0.63 0.31 18 0.54 0.26 25 0.54 0.33 5 0.72 0.29 12 0.71 0.43 19 1.00 0.17 26 0.61 0.35 6 0.66 0.35 13 0.89 0.40 20 0.76 0.20 27 0.50 0.78 7 0.85 0.34 14 0.65 0.39 21 0.68 0.18 28 0.54 0.61 a descriptive view of the data we have. Because a big part of our data is from Sweden, hence I only point out the different species here. And from the figure we can ob-serve that for some chemical components(like N, P) there seems no difference among species, but for the other chemical components (like lignin, K) different species show difference clearly in concentrations.

DATA ANALYSIS

Nonlinear Model:

In the first step, we should get the estimates of limit values. Relating MassLoss and Time by a nonlinear model, limit values can be estimated. We use ˆm to denote the estimate of m∗and ˆk for k∗. Table1is a summary of the estimates.

There are some values larger than 1, but according to the meaning of this param-eter, it should not be larger than 1, so we revised them to 1. We estimated 28 limit values in total by using nonlinear model (Equation2). Figure3is a visual view of the estimates. It seems that all ˆm are larger than 0.5 and most of them are in[0.6, 0.8]. As for ˆk, most are in the range[0.1, 0.5], only two have unusual large values, which are

(10)

Table 2: Comparison for variance explained by two random effects (by a full model) . Random Effects Species Site Variance 0.051 0.001 Residual 0.011 0.012 Model AIC 65.95 66.40

Sequence 27 and 28, that means these two sequences have quite large initial decompo-sition rates. This question will be discussed later.

Figure4is four samples of limit values’ estimation with confidence interval and observation points. It seems that most limit values can be reached before 10 years, only a few sequences have a still-increasing mass loss after 10 years, just like Sequence 1 showed. Confidence interval showed that the limit values have a limited uncertainty range for serving as responses in the following linear mixed model. Some confidence intervals have a upper bound above 100%, like Sequence 1, that is caused by a quite high estimated limit value. And some other confidence intervals have a non-stop in-creasing range as time grows up, so finally the confidence interval can be very large, this is due to a big variance of estimated limit value (m∗).

Linear Mixed Model:

The second step, the obtained estimated limit values will be used as responses for the linear mixed model, which to analyze the influncial factors for limit values. In this mixed model, there are two factors can use as random effect, which are Species and Site. By modeling a full model with these two factors separately, we can find Species can explain more variance than Site does, so finally Species was chosen as random effect. The AICs also showed that Species is a better choice as random effect. Table2

showed the details.

As a checking of independent variables, Table 3and Figure5showed the corre-lation of all independent variables. From the figure, we may find that WatSol, K and

(11)

Table 3: Independent variables correlation table. ˆ

m WatSol Lignin N P K Ca Mg Mn MAT WatSol 0.00 Lignin -0.23 -0.84 N -0.28 0.00 0.32 P 0.34 0.32 -0.32 -0.17 K 0.17 0.74 -0.80 -0.04 0.36 Ca 0.31 0.04 -0.20 -0.09 -0.05 0.25 Mg -0.30 0.17 0.09 0.78 -0.31 -0.03 -0.20 Mn 0.29 0.12 -0.19 -0.36 0.39 -0.04 -0.02 -0.13 MAT -0.17 0.01 -0.12 -0.17 -0.26 -0.17 -0.07 0.03 0.16 MAP -0.47 -0.19 0.24 0.02 0.26 -0.16 -0.54 -0.01 0.14 0.32 Lignin have strong correlation, the accurate values were presented in Table3. So there is only one variable can be reserved in general. We can also find that P, MAP, Ca and Mg seem to have some relationship with response ˆm (not so strong but better than others), so these variables could have priority in independents selection.

The first model, we use all sequences (28 in total, 25 in Sweden) as data and all variables as independents, exclude the ones who have strong correlation. As we said above, WatSol, K and Lignin may only have one reserved. In addition, Lignin has stronger relationship than K does and WatSol has no relationship with ˆm, so we can have Lignin kept. Adding N, P, Ca, Mg, Mn, MAT and MAP, there are 8 independents in the model, the result is shown in Table4.

As the Table4showed, N, P, Mn and MAT have positive relationship on ˆm while Lignin, Ca, Mg and MAP have negative relationship. P has the largest coefficient (ab-solute value), which suggested P has the most impact on ˆm. In this model, degree of freedom is 28−9=19, according to the t-distribution, the threshold of 95% confidence interval is 2.09. The t-value whose absolute value larger than 2.09 could be considered significant. By removing the non-significant independents out of model step by step,

(12)

Table 4: Summary of full model. Linear Mixed Model

Formula mˆ ∼Lignin+N+P+Ca+Mg+Mn+MAT+MAP

AIC 47.62 Random effects Variance Std.Dev. Species 0.12 0.34 Residual 0.01 0.10 Fixed effects

Estimate Std.Error t-value

(Intercept) 1.10 0.40 2.74

Lignin -1.0E-3 1.2E-3 -0.89

N 0.01 0.02 0.71 P 0.38 0.13 2.79 Ca -3.8E-3 4.4E-3 -0.87 Mg -0.04 0.04 -0.85 Mn 2.5E-4 0.03 8.0E-3 MAT 0.02 0.02 1.14

MAP -9.2E-4 2.9E-4 -3.20

the model come to be the final version. In the final version, only P and MAP reserved. Then we use the sequences only collected in Sweden to fit a comparison model, the details are shown in Table5.

From the summary of final model, we can observe that both P and MAP have a significant t-value which means P and MAP have a significant relationship with ˆm. P has a positive effect and MAP has a clear but small negative effect. And in the two models, as the random effect, Species explained most of variances, this proofs that choosing a linear mixed model and Species as random effect is reasonable. About the differences between the two model, first, the model using only Sweden sequences

(13)

Table 5: Summary of final model. Linear Mixed Model

Formula mˆ ∼P+MAP

All sequences Sweden sequences only

AIC -7.40 -8.70

Random effects

Variance Std.Dev. Variance Std.Dev.

Species 0.48 0.70 0.04 0.20

Residual 0.01 0.10 0.01 0.10

Fixed effects

Estimate Std.Error t-value Estimate Std.Error t-value (Intercept) 1.17 0.45 2.59 0.86 0.19 4.48

P 0.36 0.08 4.41 0.36 0.08 4.30

MAP -7.0E-4 1.7E-4 -4.20 -7.8E-4 1.8E-4 -4.27 have a smaller AIC than the other one does, this may because the data from other places doesn’t fitted well, another evidence is the standard error of Intercept become smaller, from 0.45 to 0.19. The second, the model using all sequences have a large percentage of variance explanation. By removing the sequences collected from the sites out of Sweden, the percentage of variance explanation by random effect reduced from ₍_0.480.48₊_0.01₎ = 98% to ₍_0.040.04₊_0.01₎ = 80%, this may imply the sites out of Sweden contain some species that have obvious difference from the species in Sweden, and the climate may be quite different from Sweden too.

Let’s do the model checking next. First, a Q-Q plot was drew for checking resid-uals’ normality (Figure6). From the figure we can see that the residuals almost shape a straight line, means a good result for normality checking. A more accurate evidence and a residual independence checking are in Table 6. Here we use Shapiro test for checking the normality and Ljung-Box test for independency, the two tests are for the residuals. The result gave us a good news, a 0.55 p-value for Shapiro test means can’t

(14)

Table 6: Model checking.

Shapiro Ljung Box P MAP p-value 0.55 0.45 VIF 1.22 1.22

reject the null hypothesis which is “residual is normal distributed” and a 0.45 p-value for Ljung Box test means can’t reject the null hypothesis which is “residual is independently distributed”. VIF, which is short for Variance Inflation Factor, provides an index that measures how much the variance of an estimated regression coefficient is increased because of collinearity. In general, a VIF value less than 10 can be thought acceptable. Here, VIF values for independent variables in Table6 proofs that there is no multi-collinearity in independent variables.

Further Discussion:

As we mentioned above, there are two sequences having a very high initial de-composition rate. By using ˆk instead of ˆm, we can fit another model to find the reason, the steps are the same as above. In this model, Lignin still has the largest correlate coefficient with response in Lignin (-0.61), Watsol (0.48) and K (0.47). But in this model, both Species and Sites couldn’t able to explain any variance, so we use linear model instead of linear mixed model. The final results are shown in Table7.

From the summary, we know that Lignin and MAP have significant relationship with initial decomposition rate (k∗). And Lignin has negative effect while MAP has positive effect. It can be explained by biological function of lignin. Lignin fills the spaces in the cell wall between cellulose, hemicellulose, and pectin components, es-pecially in tracheids, sclereids and xylem. It is covalently linked to hemicellulose and thereby crosslinks different plant polysaccharides, conferring mechanical strength to the cell wall and by extension the plant as a whole (CHABANNES1 et al.(2001 Nov)). So low lignin concentration means low strength of cell wall, which make cell easy to decomposition.

(15)

Table 7: Summary of model of ˆk. Linear Model

Formula ˆk∼ Lignin+MAP AdjustedR2 _0.64

Coefficients

Estimate Std.Error p-value (Intercept) 0.74 0.12 2.1E-6 Lignin -1.7E-3 3.9E-4 2.4E-4 MAP 2.3E-4 3.3E-5 3.2E-7

CONCLUSION

In this paper, we used a nonlinear model to fit the limit values of mass loss of plant decomposition progress and a linear mixed model to research the influential factors of limit value, find that P and MAP are the significant factors and Species is a good random effect factor. We compared the models using all sequences and Swedish sequences, the significant factors are the same but there is some small difference be-tween the coefficients, especially the variance components, that implied some impor-tant species included in foreign sequences. We also fitted a linear model for research-ing the influential factors of initial decomposition rate and found Lignin and MAP. Then we tried to explain the reason why these two factors are significant.

(16)

LITERATURE CITED

BATES, D. M., and D. G. WATTS, 1988 Nonlinear regression analysis and its applications. Wiley.

BERG, B., M. P. DAVEY, A. D. MARCO, B. EMMETT, M. FAITURI, et al., 2010 Factors influencing limit values for pine needle litter decomposition: a synthesis for boreal and temperate pine forest systems. Biogeochemistry 100: 57–73.

BERG, B., and G. EKBOHM, 1991 Litter mass-loss rates and decomposition patterns in

some needle and leaf litter types. long-term decomposition in a scots pine forest. vii. Canadian Journal of Botany 69: 1449–1456.

BERG, B., M.-B. JOHANSSON, G. EKBOHM, C. MCCLAUGHERTY, F. RUTIGLIANO, et al., 1996 Maximum decomposition limits of forest litter types: a synthesis. Cana-dian Journal of Botany 74: 659–672.

BERG, B., M.-B. JOHANSSON, and V. MEENTEMEYER, 2000 Litter decomposition in a

transect of norway spruce forests: substrate quality and climate control. Canadian Journal of Forest Research 30(7): 1136–1147.

BERG, B., K. T. STEFFEN, and C. MCCLAUGHERTY, 2007 Litter decomposition rate is dependent on litter mn concentrations. Biogeochemistry 82(1): 29–39.

CHABANNES1, M., K. RUEL, A. YOSHINAGA, B. CHABBERT, A. JAUNEAU, et al., 2001 Nov In situ analysis of lignins in transgenic tobacco reveals a differential impact of individual transformations on the spatial patterns of lignin deposition at the cellular and subcellular levels. Plant Journal 28(3): 271–282.

DEMIDENKO, E., 2004 Mixed Models - Theory and Applications. Wiley. DRAPER, 1998 Applied Regression Analysis 3rd Edition. Wiley.

FOGEL, R., and K. CROMACK, 1977 Effect of habitat and substrate quality on douglas fir litter decomposition in western oregon. Canadian Journal of Botany 55: 1632– 1640.

N ¨OMMIK, H., and K. VAHTRAS, 1982 Nitrogen in agricultural soils. American

Soci-ety of Agronomy, Inc; Crop Science SociSoci-ety of America, Inc; Soil Science SociSoci-ety of America, Inc.

(17)

PICCOLO, A., R. SPACCINI, G. HABERHAUER, and M. H. GERZABEK, 1999 Increased sequestration of organic carbon in soil by hydrophobic protection. Naturwis-senschaften 86: 496–499.

R DEVELOPMENTCORETEAM, 2011 R: A Language and Environment for Statistical

Com-puting. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.

(18)

Figure 2: Data distribution overview. Black circles are Nspr spruces, the green ones are GrNspr spruces and red ones are Silver fir.

(19)

(20)

Figure 4: Plot of estimated limit values(m∗) and confidence intervals. Green line is the estimated mass loss curve, two red lines construct the confidence interval and blue points are observations.

(21)

(22)