Estimation of Probabilities of Detection for Cracks in Pipes in Swedish Nuclear Power Plants


Lina Tidström

U.U.D.M. Project Report 2004:2

Degree project in mathematical statistics, 20 credits
Supervisors: Björn Brickstad and Tomas Jelinek, DNV

Examiner: Sven Erick Alm
February 2004


Abstract

Cracks in cooling pipes in nuclear power plants are a safety risk if they grow undetected; a large crack may cause leakage and rupture. Each summer, during the so-called revision at the Swedish nuclear power plants, non-destructive tests (NDT) are performed in which pipes and components are examined for cracks and defects. There are several methods and testing techniques. The goal of this study is to estimate the efficiency of NDT performed with ultrasonic testing for detection of intergranular stress corrosion cracks in cooling pipes. The efficiency was measured as the probability of detection (POD), expressed as a function of crack size. The data used for the estimation consisted of detection results from qualification tests and MTO studies performed at SQC, the Swedish Qualification Center. The statistical method used was generalized linear models, and SAS was used for the calculations. The estimation resulted in a model in which POD depends on absolute crack depth. Relative depth has been used in some other studies, but here relative depth is strongly non-significant, i.e. it does not contribute to explaining the detection probabilities. The study was done for DNV, Det Norske Veritas, as part of a research project performed by DNV, commissioned by SKI, Statens


Acknowledgement


Abstract
Acknowledgement
1 Introduction
1.1 POD, Probability Of (correct) Detection
2 Description of data
2.1 Qualification data
2.1.1 Example of qualification data
2.2 MTO data
2.3 Comments on qualification and MTO data
2.4 The cracks
2.4.1 Definitions of the crack variables
2.4.2 About the variables
2.4.3 The available cracks
3 Method
3.1 Correct detection
3.2 Model
3.2.1 Distributions
3.2.2 Generalized linear models (GLIM)
3.2.3 Overdispersion
4 Results
5 Discussion and conclusions
References
Appendices
A. Terminology
  PSA, Probabilistic Safety Assessment
  NDT, Non-destructive testing
  IGSCC, Intergranular Stress Corrosion Cracking
  UT-01, Ultrasonic Testing procedure
  Round Robin-trials
  PISC
  Qualification
  MTO, Man-Technology-Organization
  HAZ, Heat Affecting Zone
  CDE, Cold-Deformed Elbows
  DID, Service Induced Defects (Drift Inducerade Defekter)
  DNV, Det Norske Veritas
  SQC, Swedish Qualification Center
  SKI, Swedish Nuclear Power Inspectorate (Statens Kärnkraftinspektion)
  Nuclear Activities in Sweden
B. Statistical theory
  GLIM, Generalized Linear Models
  The exponential family
  Log likelihood function
  Link function

1 Introduction

For estimation of risk levels in complex systems, subsystems or components in an industry, so-called risk-based inspections are used more and more. The purpose of the estimation is to evaluate and plan risk reduction actions in an optimal way, with respect to both economics and safety. Profit, competition and safety demands are factors that make this kind of risk analysis increasingly desirable. Probabilistic Safety Assessment (PSA) is one such method, and it dominates within the nuclear power industry.

Cooling pipes in nuclear power plants may be a safety risk if there is a rupture and leakage. The conditional probability of core damage given leakage and rupture, C = P(core damage | leakage), can be estimated in a PSA analysis, and the risk level of a component is calculated as the product of this conditional probability and the probability of leakage and rupture. The calculated risk levels of components are used to evaluate and plan possible safety actions. Making safety systems more reliable is one way to reduce the risk level; another is to influence the probability of leakage for a component.
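The product rule for the risk level can be illustrated with a minimal sketch. The function name and the numerical values are hypothetical, chosen only to show the arithmetic; they are not figures from the study.

```python
def risk_level(p_core_damage_given_leak: float, p_leak: float) -> float:
    """Risk level of a component: C * P(leakage), with C = P(core damage | leakage)."""
    for p in (p_core_damage_given_leak, p_leak):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_core_damage_given_leak * p_leak


# Hypothetical numbers, for illustration only.
before = risk_level(1e-3, 1e-4)
after = risk_level(1e-3, 5e-5)  # leak probability halved, e.g. after an inspection
```

With the conditional probability C held fixed, any reduction in the probability of leakage lowers the risk level proportionally.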

The probability of leakage is changed if a component for example is repaired or exchanged, if it is investigated and examined for cracks and defects or if there is any mechanism observing indications of leakage in an early stage. Change of service conditions or redesign also affects the probability of leakage.

When components are examined for cracks and defects by so-called non-destructive testing (NDT), there is a good opportunity to evaluate and control risk levels for critical components. If no crack is detected, the risk level decreases, since the probability that the component actually is free of defects increases. If a crack is detected and poses a risk, action is taken and the risk level for the component decreases. In principle, the risk level always decreases after NDT.

Estimating the efficiency of NDT, the Probability Of Detection (POD), is important in PSA analyses for calculating the probability of leakage. The probability of leakage, often calculated using the theory of Probabilistic Fracture Mechanics (PFM), and thereby also the risk level for a component, is sensitive to the testing efficiency that is assumed.

The goal of this study is to estimate POD, defined as a function of crack size, for NDT performed with ultrasonic testing according to the procedure UT-01, for detection of intergranular stress corrosion cracks in cooling pipes in Swedish nuclear power plants. POD will be estimated from qualification data and MTO (Man-Technology-Organization) data from the Swedish Qualification Center (SQC). The study is part of a research project performed by Det Norske Veritas (DNV), commissioned by the Swedish Nuclear Power Inspectorate (SKI).


1.1 POD, Probability Of (correct) Detection

Non-destructive testing (NDT) of components in nuclear power plants can be done in different ways, for example with radiography, eddy current, ultrasonic testing, liquid penetrant, magnetic particle and visual inspection. Ultrasonic testing (UT) is common when searching for surface-breaking cracks, so-called Intergranular Stress Corrosion Cracks (IGSCC). The procedure UT-01 gives instructions on how to perform ultrasonic testing.

How likely a component is to cause a leak and rupture, the probability of leakage, can be calculated using the theory of Probabilistic Fracture Mechanics (PFM) with test efficiency as input data. Therefore, estimation of efficiency, the probability to find defects (POD), for each method and procedure of NDT is important.

The efficiency of ultrasonic testing according to the procedure UT-01 has been examined in several studies, but never on the basis of Swedish data. It is common to define POD as

$$\mathrm{POD} = \Phi\bigl(c_1 + c_2 \cdot \ln(a/h)\bigr) \tag{1}$$

where $a$ is crack depth, $h$ is wall thickness and $\Phi$ is the Gaussian distribution function, i.e. with relative depth as the explanatory variable. In a study by Simonen and Woo on Round Robin trials [1], the ability of different testing teams was also considered, giving the following equations.

Poor: $\mathrm{POD} = \Phi\bigl(0.240 + 1.485 \cdot \ln(a/h)\bigr)$.

Good: $\mathrm{POD} = \Phi\bigl(1.526 + 0.533 \cdot \ln(a/h)\bigr)$.

Advanced: $\mathrm{POD} = \Phi\bigl(3.630 + 1.106 \cdot \ln(a/h)\bigr)$.

The teams defined as “good” performed above average, and “advanced” teams represent performance that may be achieved with further improved procedures. The function for the “good” inspection team is what has so far been assumed when estimating probabilities of leakage and rupture. Figure 1 shows the curves plotted against the relative depth.

Figure 1. POD plotted against relative depth of crack for the Poor, Good and Advanced teams.
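The three team curves can be evaluated numerically with a short sketch. The coefficients are those quoted above from Simonen and Woo [1]; the function and dictionary names are illustrative assumptions, and the standard normal distribution function is computed from the error function.

```python
from math import erf, log, sqrt

# Coefficients (c1, c2) as quoted in the text from Simonen and Woo [1].
TEAMS = {
    "poor": (0.240, 1.485),
    "good": (1.526, 0.533),
    "advanced": (3.630, 1.106),
}


def phi(z: float) -> float:
    """Standard normal (Gaussian) distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def pod_relative_depth(a: float, h: float, team: str = "good") -> float:
    """POD = Phi(c1 + c2 * ln(a / h)) for crack depth a and wall thickness h."""
    if not 0.0 < a <= h:
        raise ValueError("need 0 < a <= h")
    c1, c2 = TEAMS[team]
    return phi(c1 + c2 * log(a / h))


# Example: a crack through half the wall thickness.
for team in TEAMS:
    print(team, round(pod_relative_depth(a=5.0, h=10.0, team=team), 3))
```

Because ln(a/h) tends to minus infinity as the crack depth tends to zero, all three curves start at POD = 0 for a vanishing crack.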

Another study, by Simola and Pulkkinen [2], applied to data from the PISC III exercise, resulted in the following equation:

$$\mathrm{POD} = \Phi\bigl(1.64 + 0.75 \cdot \ln(a/h)\bigr),$$

which is plotted in Figure 2.

Figure 2. The Simola & Pulkkinen study (POD plotted against relative depth of crack).

In this study, a POD function will be estimated from hit/miss data from qualification tests. The environment is more idealized than testing in a nuclear power plant, so the estimated POD will probably overestimate the true probabilities. In a nuclear power plant the operator is exposed to heat, noise, radiation, safety clothing and time pressure, all factors that affect performance. Also, the person being qualified is extra motivated, since passing is the way to get employment.

However, the estimation gives an idea of the efficiency of UT-01.


2 Description of data

The data used for estimation of POD come from qualification tests performed at the Swedish Qualification Center (SQC). Hans Lundberg at SQC has been consulted on questions about the data and the procedure UT-01.

In addition, data from MTO studies (Man-Technology-Organization) have been used, and Johan Enkvist, Dept. of Psychology, Stockholm University, has been consulted on questions about those.

2.1 Qualification data

Qualification data consists of 117 cracks (IGSCC in austenitic stainless steel): 16 from Cold-Deformed Elbows (CDE) and 101 from straight pipes with welded joints (Heat Affecting Zone, HAZ). Most of the cracks were manufactured fatigue cracks that were then welded into the test pipes. 14 of the cracks from straight pipes were real IGSCC cracks (Service Induced Defect, DID).

The data contain detection results from 41 people who performed qualification tests at SQC. Of these, 27 passed at the first attempt; 13 took the test a second time, of whom 8 qualified; and one person made a successful third attempt. Five of the 41 persons never qualified. Altogether there are results from 55 performance tests.

2.1.1 Example of qualification data

How well each crack has been detected and characterized at the qualifications is described in data with the following notation (see Table 1 for an example).

X        correct; i.e. the crack is detected, identified as a crack, and its size and location are stated close enough to the real ones,

FC       False Call; what is thought to be a crack is incorrectly located where there is no crack,

0        wrong; an existing crack has not been detected,

(blank)  no test; the piece containing this crack has not been examined,

0F       correct detection but incorrect characterisation; the crack has been detected but characterized as a geometrical defect, i.e. not as a crack.

At the qualification tests there are also test pieces without cracks. These, however, are not included in the data, so the total number of false calls cannot be determined, and some of the qualification results may therefore seem odd at first glance. The different levels of the qualification results are of no real interest here, though. For estimation of POD, correct detection (X) is what will be considered a successful test, and false calls (FC), incorrect characterisation (0F) and wrong detection (0) an unsuccessful test, see Section 3.2.1.
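The coding of results into successful and unsuccessful tests can be sketched as follows. The function names are hypothetical, and representing an untested piece as `None` or an empty string is an assumption about how the blank cells would be stored.

```python
from typing import Iterable, Optional, Tuple


def is_success(code: Optional[str]) -> Optional[int]:
    """Map a result code to 1 (correct detection), 0 (unsuccessful) or None (no test)."""
    if code is None or code.strip() == "":
        return None  # piece not examined: contributes no trial
    code = code.strip().upper()
    if code == "X":
        return 1
    if code in {"FC", "0", "0F"}:
        return 0
    raise ValueError(f"unknown result code: {code!r}")


def detections_and_trials(codes: Iterable[Optional[str]]) -> Tuple[int, int]:
    """Return (number of correct detections, number of tests) for one crack."""
    outcomes = [is_success(c) for c in codes]
    tested = [o for o in outcomes if o is not None]
    return sum(tested), len(tested)


# Results listed for crack 26 in Table 1.
y, n = detections_and_trials(["X", "FC", "0F", "X", "0F"])
```

For crack 26 this gives two correct detections in five tests.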


Operator   AB1   AV2   BH1   BO1   BU1
Result     OK    OK    OK    OM1   OMA

Crack nr   Results (in operator order; cells without a test are omitted)
21         X   X   X   X
22         X   X   X   X
23         X   X   X   0F
24         X   X   X   0F
25         X   X   X   X
26         X   FC  0F  X   0F
27         X   0   0F  X   X
28         X   X   X   X   X
29         X   X   X   X   X
30         X   X   X   X   X
31         0   X   X   X   0
32         X   X   0   X   0F
33         X   X   X   X   0
34         X   X   X   X   0F
35         X   X   X   X   X
36         0   FC
37         X   X   X   X
38         X   X   0   0
39         X   FC  0   X

Table 1. Example of qualification data

Depending on the wall thickness of the pipe, the cracks are divided into groups.

Group 1: wall thickness <7 mm,
Group 2: wall thickness 7-15 mm,
Group 3: wall thickness >15 mm.
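The grouping by wall thickness amounts to a simple threshold rule. A minimal sketch, with a hypothetical function name, assuming that the boundary values 7 mm and 15 mm both belong to group 2 as the ranges above suggest:

```python
def thickness_group(wall_thickness_mm: float) -> int:
    """Return the crack group (1, 2 or 3) for a given pipe wall thickness in mm."""
    if wall_thickness_mm <= 0:
        raise ValueError("wall thickness must be positive")
    if wall_thickness_mm < 7:
        return 1
    if wall_thickness_mm <= 15:  # assumption: 7 mm and 15 mm fall in group 2
        return 2
    return 3
```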

The identities of people and cracks are coded for anonymity. Each person is named with two letters followed by a number telling if the test was performed for the first, second or even the third time. The cracks are numbered 1-117. Information whether a qualification test was successful, or for what group of cracks it was not, is given by: OK = qualified, OM1, OM2, OM3 or OMA = missed on group 1, 2, 3 or all three. More about qualification tests and the specific criteria can be read in Appendix A.

2.2 MTO data

The MTO data comprise a total of 12 cracks and 21 persons from two studies (9 cracks and 14 operators in study 1; 12 cracks and 19 operators in study 2). Nine of the cracks were used, and 12 of the operators participated, on both occasions. The cracks and persons can also be found in the qualification data. One year passed between the studies. See Appendix A for more details. The persons who participated in the MTO studies were already qualified operators.


2.3 Comments on qualification and MTO data

The number of tests performed and the number of successful detections for each crack are given in the data.

Some persons have performed qualification tests two times (one person even three times) and also participated in one or two MTO-studies. One approach is to consider them as different persons at different occasions since their knowledge and skill have improved or changed after studying more, and between the MTO studies one year has passed. Also, the same person has never tested the same test pieces (except for the MTO studies).

Qualification results in themselves (OK, OM1, ...) are not an issue here. The qualification criteria also involve the frequency of false calls for non-existing cracks, which is not included in the data and is of no interest here, since it is not used for estimation of POD (there is no crack). Generally, individual detection frequencies are high.

Nine test occasions were removed from data since these (five) persons never qualified. This is because they will never perform any NDT in nuclear power plants and are not considered representative for this study.

The qualification data is sparse with only 3-9 tests per crack, making inference more uncertain. For cracks tested in the MTO studies another 19-33 tests have been made.


2.4 The cracks

2.4.1 Definitions of the crack variables

To describe the location and size of a crack, a coordinate system is imagined as shown in Figure 3.

Figure 3. Coordinate system

The following variables are defined for each crack:

depth (of crack)   Measured as the difference in z from the inside of the pipe.

length             Difference in x.

distance           The mean of two distances from the centre of the weld (or a reference line) to two points of the crack.

(wall) thickness

tilt               The angle a crack makes with the z-axis, as shown in Figure 4.

skew               The angle between crack and weld (x-axis), also shown in Figure 4.


CDE   Cold-Deformed Elbow; a bent pipe with cracks located in the elbow, see Figure 5. This information was not given in the data but was defined after consulting SQC.

Figure 5. Cold-deformed elbow

HAZ   Heat Affecting Zone; a straight pipe with cracks in the area where there is a welded joint. This was not given in the data but was defined after consulting SQC.

DID   Service Induced Defect (Drift Inducerad Defekt); a piece of pipe coming from an authentic pipe from a nuclear power plant, containing real IGSCC cracks.

All measurements are given in mm and all angles in degrees.

2.4.2 About the variables

All the variables mentioned above, and perhaps even interactions between them, might influence POD in different ways. A simple model for the estimated POD, with few explanatory variables, is desired. As mentioned in Section 1.1, relative depth of the crack is a commonly used variable, which implies that cracks of the same absolute depth in pipes of different thickness are not detected with the same probability. The distance between the transmitter and the crack affects the sound waves: over longer distances the waves are more attenuated, which makes detection harder, so relative depth might be of more interest than absolute depth. On the other hand, cracks of the same relative depth may have very different absolute depths, and POD ought to be related to the actual crack size as well.

Length also gives information about the crack size. Intuitively depth is more interesting, since it is more closely related to the probability of leakage. The interaction between length and depth (the area of a crack) will also be considered.

The distance to the welded joint is not expected to influence POD, at least not for the artificial cracks tested at the qualifications, since these welds, unlike real ones, do not reflect the sound waves. In reality, closeness to the weld makes detection hard, since signals from cracks might be difficult to distinguish from reflections caused by the weld. This problem does not occur for cracks in elbows.


returning to the transmitter. This problem might be caused by the presence of tilt or skew. However, the procedure UT-01 [3] covers cracks with tilt ±30° and skew ±20°; small indications should be examined more carefully using different transmitters and directions. If there is an effect of tilt and skew, it is probably not that obvious in the data for this study, since cracks are expected here when testing and even small signals are presumably paid attention to. The case might be the opposite for NDT in nuclear power plants, where cracks are generally not expected. By the same argument, POD is expected to overestimate the true detection probability for very small cracks.

Cracks from pipes with real IGSCC cracks, DID, are the most interesting ones since they are not artificial. Unfortunately their number is small (12).

2.4.3 The available cracks

The 97 cracks that will be considered for estimation of POD have depths in the interval 2-26 mm, length 12-66 mm, wall thickness 4-35 mm and distance to weld 3-22 mm, shown below in Figure 8.

To look for extreme or unusual cracks, different plots were examined. One crack turned out to be a bit odd, marked as a square in Figure 6: it has no depth defined but is at the same time the longest of all the 117 cracks. DNV’s advice was to remove it when fitting a model. For the rest, a deeper crack is generally also longer. Three more cracks were excluded since they were never tested (in the data set considered for estimation of POD). Cracks from elbows (CDE), symbolised by circles in Figure 7, all have larger distance variables than those from straight pipes (HAZ); they also belong to the shallower ones from thinner pipes. The 16 cracks from CDE will also be removed before estimating POD, not only because of the different distances but also because the effect of distance would be different for CDE and because they differ from cracks at HAZ. The fact that CDE components are now being replaced in Swedish nuclear power plants is also considered. A total of 97 cracks will be used for estimation of the POD function.

Figure 6. Depth of crack plotted against length of crack.

Figure 7. Depth plotted against distance to weld, for all 117 cracks, cracks from CDE symbolised with circles.

A pairwise measure of association between the variables can be calculated with the so-called correlation coefficient, r. When the correlation between two variables is zero they are uncorrelated, i.e. there is no linear association between them (although this alone does not guarantee independence). A positive correlation indicates that a large value of one variable tends to go with a large value of the other; a negative correlation indicates the opposite. The magnitude of r is bounded by 1.

Correlations between the variables depth of crack, length of crack, distance to weld and thickness of pipe for the 97 cracks considered are shown in Table 2. Each cell also includes the p-value for the hypothesis test: correlation = 0. A p-value less than or equal to 0.05 means that the hypothesis is rejected at the 5% level, i.e. that there presumably is some association between the variables. The smaller the p-value, the stronger the evidence against zero correlation. The relations between the variables are also shown in Figure 8, with the 12 DID marked as triangles.

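            depth      length     thickness   distance
depth       1
length      0.7        1
            <.0001
thickness   0.6565     0.45821    1
            <.0001     <.0001
distance    0.23145    0.10362    -0.06882    1
            0.0225     0.3125     0.5030

Table 2. Correlations between the variables for the 97 cracks considered, with the p-value for the test of zero correlation beneath each coefficient.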
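The correlation coefficient and the test statistic behind such p-values can be sketched in plain Python. The function names are illustrative. The sketch stops at the t statistic, since turning it into a p-value needs the Student t distribution, which is not in the Python standard library.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample correlation coefficient r between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t = r * sqrt((n - 2) / (1 - r^2)); Student t with n - 2 df if correlation = 0."""
    return r * sqrt((n - 2) / (1.0 - r * r))


# Depth versus distance for the 97 cracks: r = 0.23145 in Table 2.
t_depth_distance = t_statistic(0.23145, 97)
```

For depth against distance the statistic is about 2.32 on 95 degrees of freedom, which is consistent with the p-value of 0.0225 reported in the table.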


Figure 8. Depth, length, distance and thickness plotted against each other for the 97 cracks considered in the data set for estimation of POD.

The 12 Service Induced Defects (DID) are all at most 8 mm deep, with lengths between 28 and 66 mm, located 10-21 mm from the weld, in pipes with 10 or 16 mm wall thickness. The number is too small, though, to draw any conclusions about real cracks. (Perhaps they are generally a bit longer than artificial cracks of the same depth.)


3 Method

3.1 Correct detection

Correct detection of a crack, i.e. a crack that is correctly located and measured, is what is of interest for estimation of POD. False calls or wrong characterizations are considered unsuccessful tests; POD is estimated for actual cracks. When a crack is found, action will be taken to reduce the probability of leakage. If this is done for non-existing or incorrectly located cracks, costly repairs will be made unnecessarily and a dangerous crack may remain. To be able to calculate risk levels and to optimise risk reduction measures, POD is needed.

3.2 Model

3.2.1 Distributions

There are only two possible outcomes when searching for a crack: either it is detected or not. The result can be described by the stochastic variable $W_{i,k}$, following the Bernoulli distribution:

$$W_{i,k} \sim \mathrm{Be}(p_i), \qquad W_{i,k} = \begin{cases} 1, & \text{for correct detection } (X), \\ 0, & \text{otherwise } (FC,\ 0,\ 0F), \end{cases}$$

with $p_i = P(W_{i,k} = 1)$, where

$i = 1, \ldots, N$ is the crack number, $N$ = total number of cracks,

$k = 1, \ldots, n_i$ is the test number for crack $i$, $n_i$ = total number of tests of crack $i$.

Each crack has been tested several times, giving more information on how easy the crack is to detect. $W_{i,k}$ is summed over all tests of crack $i$, and the binomially distributed variable $Y_i$ is obtained:

$$Y_i = \sum_{k=1}^{n_i} W_{i,k}, \qquad Y_i \sim \mathrm{Bin}(n_i, p_i),$$

with $\mathrm{E}(Y_i) = n_i \cdot p_i$ and $\mathrm{Var}(Y_i) = n_i \cdot p_i \cdot (1 - p_i)$.

The probability of (correct) detection, $p_i$, is unknown but can be estimated by the observed relative detection frequency $\hat{p}_i = y_i / n_i$, where $y_i$ is the observed detection frequency. For this estimator,

$$\mathrm{E}(\hat{p}_i) = p_i \quad \text{and} \quad \mathrm{Var}(\hat{p}_i) = \frac{p_i \cdot (1 - p_i)}{n_i}.$$
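The relative detection frequency and its uncertainty can be computed per crack as in the sketch below. The function name is hypothetical, and the standard error uses the plug-in estimate, replacing the unknown p by its estimate in the variance formula.

```python
from math import sqrt


def detection_frequency(y: int, n: int):
    """Return (p_hat, se) with p_hat = y / n and se = sqrt(p_hat * (1 - p_hat) / n)."""
    if n <= 0 or not 0 <= y <= n:
        raise ValueError("need n > 0 and 0 <= y <= n")
    p_hat = y / n
    se = sqrt(p_hat * (1.0 - p_hat) / n)  # plug-in estimate of the standard deviation
    return p_hat, se


# A crack detected correctly in 2 of 5 tests.
p_hat, se = detection_frequency(2, 5)
```

With only a handful of tests per crack the standard error is large, which is why the individual frequencies are pooled in a regression model rather than interpreted crack by crack.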

Since different people have examined the same crack, $\hat{p}_i$ depends on individual abilities as well. Above, no account is taken of different operators. The estimation of POD is supposed to represent NDT performed by qualified operators following the procedure UT-01. Nevertheless, differences between operators will cause some problems when making inference for POD, which is discussed in Section 3.2.3.

3.2.2 Generalized linear models (GLIM)

To examine how POD depends on size and location of the crack, a regression analysis is performed according to generalized linear models. This is appropriate for the binomial distribution.

The event probability $p$ (actually, the mean of the response) and the explanatory variables $x_1, x_2, \ldots, x_{R-1}$ are related by a so-called link function $g$:

$$g(p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_{R-1} x_{R-1} = X\beta.$$

For the binomial distribution, the so-called probit and logit links are commonly used. More about GLIM and statistical theory can be found in Appendix B.
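The two link functions and their inverses can be written out in a few lines. This is a minimal sketch with illustrative function names, using the standard normal distribution from Python's `statistics` module; it is not the SAS code used in the study.

```python
from math import exp, log
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit(p: float) -> float:
    """Probit link: g(p) is the inverse of the standard normal distribution function."""
    return _STD_NORMAL.inv_cdf(p)


def inv_probit(eta: float) -> float:
    """Inverse probit link: p = Phi(eta)."""
    return _STD_NORMAL.cdf(eta)


def logit(p: float) -> float:
    """Logit link: g(p) = ln(p / (1 - p))."""
    return log(p / (1.0 - p))


def inv_logit(eta: float) -> float:
    """Inverse logit link: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + exp(-eta))


def linear_predictor(beta, x):
    """eta = beta_0 + beta_1 * x_1 + ... + beta_{R-1} * x_{R-1}."""
    if len(beta) != len(x) + 1:
        raise ValueError("beta must have one more element than x")
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))
```

With the probit link, the fitted probability is the standard normal distribution function applied to the linear predictor, which is the form of Equation (1).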

3.2.3 Overdispersion

As mentioned above, the detection probability also depends on the operators performing the tests, not only on the size and location of the crack. This is known from different studies; developing procedures, strategies and education of personnel is one way to make the human factor less influential. When a model for POD is fitted considering only the sizes of the cracks (not the operators), the results will be affected by some underlying distribution of the detection probabilities over individuals (lack of homogeneity): the variance of the response will be larger than expected for binomially distributed variables, resulting in too many significant parameters. A way to model overdispersion is to introduce a so-called scale parameter, $\phi$, into the variance function: $\mathrm{Var}(Y) = \phi \cdot \sigma^2$. The estimates of the model parameters will be the same as without a scale parameter; it is their standard deviations that change, affecting their significance. There are different ways to estimate the parameter

4 Results

Before fitting a model for POD, questions such as the following had to be answered:

• Use all data from both the qualification tests and MTO-studies?

• Which cracks to include - should cracks with tilt and/or skew be excluded? What about cracks in elbows?

• Which link function is appropriate?

• Should the variables be transformed by, for example, logarithms?

Depth of crack, length of crack, distance from weld joint, thickness of pipe wall, tilt, skew, DID, HAZ or CDE, and detection results from qualifications and MTO studies: this is the information available for the 117 cracks. The variables are described further in Section 2.4. Since data is sparse, DNV’s opinion is that all data should be used, except for data from the persons who never passed the qualification test. They also find the MTO data very interesting, because these tests are performed by already qualified operators and might be closer to reality. The real IGSCC cracks (DID) are considered the most interesting cracks. The problem, though, is that the number of cracks in the MTO data and the number of DID are small: 12 cracks were tested in the MTO studies and there are 12 real cracks, which is too few for reliable estimates. The information from qualification data and the manufactured fatigue cracks is richer. Independently of testing situation or type of crack, probabilities of detection in the data are assumed to follow the same POD model, and all data will be used for estimation of POD, with indicator variables for real IGSCC cracks and for detection results from MTO studies, defined as

$$\mathrm{DID} = \begin{cases} 1, & \text{for Service Induced Defects (real IGSCC cracks)}, \\ 0, & \text{for artificial cracks}, \end{cases} \quad \text{and} \quad \mathrm{MTO} = \begin{cases} 1, & \text{for cracks tested in the MTO studies}, \\ 0, & \text{otherwise}. \end{cases}$$

This is a way to use as much information from the data as possible while still getting an idea of a level of POD closer to reality than if POD were estimated only from qualification results (which are likely to overestimate the true probabilities). The estimated parameter for the explanatory variable will be the same whether DID/MTO = 1 or 0, but the constant ($c_1$ in Equation (1)) will change between the cases (provided there is a significant difference in POD between DID/MTO = 1 and 0, so that estimating the indicator parameters makes sense).

Cracks with tilt and/or skew are kept in the data. There is no obvious reason to exclude them. Indicator variables for tilt and skew will be estimated if they turn out to have significantly different POD, modeling for the differences in that case. If cracks with tilt and/or skew would give extremely different results this would of course be reconsidered. Also, in a real


Four cracks are excluded before fitting a model, because of odd size variables or because they were never tested. The group of 16 cracks from CDE is also excluded, among other reasons because of differences in the variables. Excluding CDE also means that the incomplete information on tilt and/or skew for these cracks is no longer a problem. Altogether there are 97 cracks remaining.

First, a model was fitted only to qualification data, to try and keep the data as homogeneous as possible and reduce possible disturbing effects from the MTO data. A 14-parameter model was defined with probit link, and step-by-step the most insignificant variable removed at 5% significance level. The probit link was chosen, since it is appropriate, and also because it has been used in some other studies, see Equation (1). A logit link was also examined, but did not improve the model. The variables were depth, length, distance, thickness, their interactions of second order and indicator variables for tilt, skew and DID. Not surprisingly, this did result in a large model with many significant explaining variables: depth of crack, length of crack, thickness of pipe, distance to weld, interactions: (depth×length), (depth×distance), (thickness×distance), skew and DID at 5% significance level. The reason is probably overdispersion, see Section 3.2.3, which also is hinted by the deviance statistics (D = 145.9545 with 87 degrees of freedom, the ratio should be close to one).
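The overdispersion hinted at by the deviance can be quantified with a one-line calculation. The sketch below uses the simple deviance-based scale estimate and the corresponding inflation of standard errors. This is an illustrative alternative with hypothetical function names, not Williams’ method, which is the method the study applies.

```python
from math import sqrt


def dispersion_from_deviance(deviance: float, df: int) -> float:
    """Simple scale estimate: deviance divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return deviance / df


def inflate_standard_error(se: float, phi: float) -> float:
    """Standard errors grow by sqrt(phi) when Var(Y) is scaled by phi."""
    return se * sqrt(phi)


# Deviance statistics reported for the large probit model.
phi_hat = dispersion_from_deviance(145.9545, 87)
```

The ratio is roughly 1.68 rather than 1, so under this simple approach the standard errors would grow by a factor of about 1.3, which is enough to make borderline parameters non-significant.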

To avoid misleading inference for the regression parameters, a scale parameter to handle the overdispersion was estimated by Williams’ method from the large model, and the procedure of eliminating non-significant parameters was carried through once more. Now the only significant variables turned out to be depth of crack and an indicator variable for DID (at 5% significance level), and the POD function can be written

$$\mathrm{POD} = \Phi\bigl(c_1 + c_2 \cdot \mathrm{depth} + c_3 \cdot \mathrm{DID}\bigr). \tag{2}$$

The significant and negative parameter estimated for the DID indicator, see Equation (3), indicates that the authentic cracks are harder to detect than the artificial ones.

Length and thickness are each strongly significant too when used, together with the dummy variable for DID, in place of depth. There are strong correlations between these two variables and depth, according to Table 2, meaning that by themselves they will probably all give a similar explanation of POD. No combination of the three size variables or their interactions improves the model significantly. The indicators of tilt and skew are non-significant if added to the model. Depth is most intuitively related to the probability of leakage and is therefore selected as the explanatory variable. The parameters estimated for the model in Equation (2) are

$$\mathrm{POD} = \Phi\bigl(0.8519 + 0.0707 \cdot \mathrm{depth} - 0.7015 \cdot \mathrm{DID}\bigr). \tag{3}$$

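So, for authentic cracks (DID = 1), now with the constant $c_1 = 0.8519 - 0.7015 = 0.1504$, Equation (3) can be written more clearly as

$$\mathrm{POD} = \Phi\bigl(0.1504 + 0.0707 \cdot \mathrm{depth}\bigr), \tag{4}$$

and for artificial cracks (DID = 0), as

$$\mathrm{POD} = \Phi\bigl(0.8519 + 0.0707 \cdot \mathrm{depth}\bigr). \tag{5}$$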

Taking the logarithm of depth makes sense, since POD then approaches zero for very small depths: a non-existing crack cannot be found. The model does not seem to be affected by the logarithm in other ways; whether or not to transform depth with the logarithm is a matter of taste. Transformed depth yields

$$\mathrm{POD} = \Phi\bigl(c_1 + c_2 \cdot \ln(\mathrm{depth}) + c_3 \cdot \mathrm{DID}\bigr), \tag{6}$$

and with estimated parameters we get

$$\mathrm{POD} = \Phi\bigl(0.5790 + 0.4422 \cdot \ln(\mathrm{depth}) - 0.7173 \cdot \mathrm{DID}\bigr),$$

giving, for DID = 1,

$$\mathrm{POD} = \Phi\bigl(-0.1383 + 0.4422 \cdot \ln(\mathrm{depth})\bigr), \tag{7}$$

and for DID = 0,

$$\mathrm{POD} = \Phi\bigl(0.5790 + 0.4422 \cdot \ln(\mathrm{depth})\bigr). \tag{8}$$

How the logarithm affects the function, can be seen in Figure 9 and Figure 10.
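The effect of the transformation can also be checked numerically. The sketch below evaluates the fitted models with the coefficients given in the text; the function names are illustrative, and depth is in mm.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function


def pod_linear(depth_mm: float, did: int) -> float:
    """Equation (3); reduces to (4) for did = 1 and (5) for did = 0."""
    return PHI(0.8519 + 0.0707 * depth_mm - 0.7015 * did)


def pod_log(depth_mm: float, did: int) -> float:
    """Fitted form of Equation (6); reduces to (7) for did = 1 and (8) for did = 0."""
    if depth_mm <= 0:
        raise ValueError("depth must be positive for the log-depth model")
    return PHI(0.5790 + 0.4422 * log(depth_mm) - 0.7173 * did)


# Compare both models for authentic cracks (DID = 1) at a few depths.
for depth in (0.5, 2.0, 8.0, 26.0):
    print(depth, round(pod_linear(depth, 1), 3), round(pod_log(depth, 1), 3))
```

The main qualitative difference is at the lower end: the model with untransformed depth still gives a POD of about 0.80 for artificial cracks at depth zero, whereas the log-depth model tends to zero.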

Figure 9. POD for DID = 1 (authentic cracks) plotted against depth of crack, Equations (4) and (7).

Figure 10. POD for DID = 0 (artificial cracks) plotted against depth of crack, Equations (5) and (8).

In Figure 11 the POD curves for DID = 0 and DID = 1 are plotted together to show the differences, and in Figures 12 and 13 they are plotted separately, together with the observed (relative) detection frequencies and the lower and upper 95% confidence bands. The confidence band connects the confidence intervals for each point on the curve. Cracks of the same depth with the same observed detection frequencies are symbolized with circles in Figure 12.

Figure 11. POD plotted against depth of crack for DID = 0 and DID = 1.

0 0,1 0,2 0,3 0,4 0,5 0,6 0,7 0,8 0,9 1 0 5 10 15 20 25 30 depth of crack POD y/n 2 cracks p, DID=1 L U

Figure 12. POD with confidence band (95%) and observed relative detection frequencies for authentic cracks (DID = 1), Equation (7).

[Plot: POD (0 to 1) against depth of crack (0 to 30); observed frequencies y/n, curve p for DID = 0, and lower (L) and upper (U) confidence limits.]

Figure 13. POD with confidence band (95%) and observed relative detection frequencies for artificial cracks, Equation (8).

Now, considering (6) as the basic model, the MTO data are included, a dummy variable for MTO is introduced and the parameters are re-estimated, giving

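$$\mathrm{POD} = \Phi(0.6503 + 0.3720 \cdot \ln(\mathrm{depth}) - 0.5285 \cdot \mathrm{DID} - 0.5015 \cdot \mathrm{MTO}). \tag{9}$$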

For authentic cracks in a qualification situation (DID = 1 and MTO = 0)

$$\mathrm{POD} = \Phi(0.1218 + 0.3720 \cdot \ln(\mathrm{depth})), \tag{10}$$

and for tests of artificial cracks in an MTO situation (DID = 0 and MTO = 1)

$$\mathrm{POD} = \Phi(0.1488 + 0.3720 \cdot \ln(\mathrm{depth})). \tag{11}$$

For artificial cracks in a qualification situation (DID = 0 and MTO = 0)

$$\mathrm{POD} = \Phi(0.6503 + 0.3720 \cdot \ln(\mathrm{depth})). \tag{12}$$

Note that Equations (10) and (11) are almost identical! Models estimated from data sets with and without the MTO data are also almost identical for a given value of DID, i.e. Equations (7) and (10) for DID = 1 and Equations (8) and (12) for DID = 0, as shown in Figure 18.

Equations (10), (11) and (12) are plotted together against depth of crack in Figure 14, and separately in Figures 15-17. Cracks of the same depth with the same observed detection frequencies are symbolized with circles in Figure 15.

[Plot, Figure 14: POD (0 to 1) against depth of crack (0 to 30); curves p for Equations (12), (11) and (10).]

[Plot: POD (0 to 1) against depth of crack (0 to 30); observed frequencies y/n (circles mark 2 cracks), curve p for Equation (10), and lower (L) and upper (U) confidence limits.]

Figure 15. POD for authentic cracks in a qualification situation, Equation (10).

[Plot: POD (0 to 1) against depth of crack (0 to 30); observed frequencies y/n, curve p for Equation (11), and lower (L) and upper (U) confidence limits.]

Figure 16. POD for artificial cracks in an MTO situation, Equation (11).

[Plot, Figure 17: POD (0 to 1) against depth of crack (0 to 30); observed frequencies y/n, curve p for Equation (12), and lower (L) and upper (U) confidence limits.]

[Plot, two panels: POD (0 to 1) against depth of crack (0 to 30); first panel with curves p for Equations (7) and (10), second panel with curves p for Equations (8) and (12).]

Figure 18. POD estimated for authentic cracks in a qualification situation with or without MTO in the data set, Equation (7) and (10), and estimated for artificial cracks in a qualification situation with or without MTO in the data set, Equation (8) and (12).

If cracks from cold-deformed elbows are also included, POD does not seem to differ between straight pipes and elbows when the variables ln(depth), DID and MTO are considered: an indicator variable for cracks in elbows is non-significant. The estimated parameters, for all the 113 cracks, are almost the same as before, with

$$\mathrm{POD} = \Phi(0.5496 + 0.4122 \cdot \ln(\mathrm{depth}) - 0.4862 \cdot \mathrm{DID} - 0.4623 \cdot \mathrm{MTO}).$$
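To show how such a fitted model might be used, the sketch below evaluates the model for all 113 cracks and inverts it to find the depth at which a target POD is reached. It is an illustration only: depth is assumed to be in mm, and the names `pod` and `depth_for_pod` are hypothetical.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Coefficients of the model fitted to all 113 cracks (probit scale).
INTERCEPT, B_LN_DEPTH, B_DID, B_MTO = 0.5496, 0.4122, -0.4862, -0.4623


def pod(depth_mm, did, mto):
    """Probability of detection for a crack of the given depth (assumed mm)."""
    eta = INTERCEPT + B_LN_DEPTH * log(depth_mm) + B_DID * did + B_MTO * mto
    return STD_NORMAL.cdf(eta)


def depth_for_pod(target, did, mto):
    """Depth at which the fitted curve reaches `target`, found by inverting the model."""
    eta = STD_NORMAL.inv_cdf(target)
    return exp((eta - INTERCEPT - B_DID * did - B_MTO * mto) / B_LN_DEPTH)


# Example: depth needed for 90% POD, authentic crack, qualification situation.
d90 = depth_for_pod(0.90, did=1, mto=0)
print(round(d90, 1), round(pod(d90, 1, 0), 3))
```

Depths obtained this way should be read with care: the authentic cracks and the cracks tested in the MTO studies all have depths of at most 8 mm, so larger values are extrapolations.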


5 Discussion and conclusions

The goal of this study was to estimate the probability of detection for cracks (intergranular stress corrosion cracks) in pipes in Swedish nuclear power plants, when using ultrasonic testing. Different factors might affect how likely a crack is to be detected. In this study the following variables were considered: depth of crack, length of crack, distance to weld joint, thickness of pipe wall, whether the crack was authentic (DID) or artificial (as most of the cracks in the data were), and whether the crack had tilt and/or skew (for explanations see Section 2.4.1). That was the information given for the cracks in the data used for the estimation. The data included cracks from both straight pipes and elbows. The cracks from so-called cold-deformed elbows were excluded, though, since these cracks differ from the rest. The data consisted of detection results from qualification tests and MTO studies performed at SQC, Swedish Qualification Center. Detection results from 97 cracks (from straight pipes) were used for the estimation.

The statistical method used was generalized linear models with a probit link function, which is suitable for binomially distributed data. At first, a 14-parameter model was defined for the qualification data only (to reduce possible effects from the MTO data), with the variables mentioned above and interactions between the first four. Step by step, the least significant variable was removed, at the 5% significance level. This, however, resulted in a large model, with 10 significant parameters. The reason for this is probably overdispersion, due to the fact that the probabilities presumably depend on individual operators as well. To model the overdispersion, a scale parameter, φ, was estimated by Williams’ method, see Section 3.2.3. After introducing the scale parameter, the resulting model had depth of crack as explaining variable, with significantly lower detection probabilities for real IGSCC (DID) than for artificial cracks.

If the variables depth, length, thickness of pipe wall and distance to weld are investigated one at a time, each together with an indicator variable for DID (since the POD is different for authentic and artificial cracks), they all turn out to be significant, except distance. However, none of the three resulting models is better than the others: the level of explanation is about the same for all three, and depth, length and thickness are strongly correlated (meaning that by themselves they will probably all give a similar explanation of POD), see Section 2.4.3 and Table 2. Depth is therefore chosen as explaining variable, also because it is more intuitively related to the probability of leakage. A model with depth (and the DID indicator) is not improved by adding any of the other variables or interactions.

A common transformation of size variables is the logarithm, which was considered appropriate for the fitted model, since the function then is close to zero for very small cracks; for estimated parameters, see Equations 7-8 and Figures 11-13.

The situation when performing ultrasonic testing at a qualification test differs from that when testing at nuclear power plants. The environment is a laboratory without disturbing factors such as time pressure, noise and heat, and the operator is probably extra motivated and careful when qualifying, in order to pass the test. Also, cracks are expected at qualification tests, contrary to when testing in nuclear power plants. Detection probabilities estimated from qualification data are therefore likely to overestimate the true probabilities. The fact that most of the cracks in the data are artificial also might affect the estimated

Data from MTO studies were assumed to be closer to reality than qualification data. These tests were performed in the same environment and on some of the cracks (mostly artificial) also tested at the qualification tests, but by already qualified operators, i.e. without having to pass any qualification test. As a next step, the MTO data were added to the qualification data to investigate whether there was any significant difference between detection probabilities depending on situation (for a model where ln(depth) was considered as explaining variable). Detection probabilities turned out to be lower for the MTO data, as expected, see Equations 10-12 and Figures 14-17.

MTO data and detection results for real IGSCC cracks (DID), which were considered more interesting than qualification data and artificial cracks, were unfortunately sparse. There were only 12 cracks in the MTO data and 12 real IGSCC altogether. The small numbers make the inference uncertain; a single crack might affect the estimation a lot.

The fact that the number of tests performed for each crack generally is small, with 3-9 tests/crack in qualification data, also makes the inference uncertain, since the random variation in the data is large. Perhaps this is the reason for the remarkably low level of model fit for the estimated models.

For the estimated probabilities, it is important to be aware of the following, which may make the estimates uncertain:

• Most of the cracks in the data were artificial.

• The environment where the tests were performed was a laboratory.

• Data from real IGSCC (DID) and MTO studies were sparse.

• Data generally were sparse.

• DID and cracks tested at MTO studies have depths less than or equal to 8 mm.

• The assumption that detection probabilities in an MTO situation have the same model as detection probabilities in a qualification situation might be questioned.

It could be of interest in future studies to investigate why real IGSCC cracks (DID) seem to be harder to detect than the artificial ones tested at qualifications, and to consider whether qualification tests are representative of true tests. In future studies, the different variables could also be investigated further and, for example, perhaps truncated for large size values, or transformed in another way.


References

[1] Simonen F.A., Woo, H.H., Analyses of the Impact of Inservice Inspection Using a Piping Reliability Model, NUREG/CR-3869, 1984.

[2] Simola, K., Pulkinnen, U., Statistical Models for Reliability and Management of Ultrasonic Inspection Data, Report No. KUNTO(96)10, VTT Automation, Finland, 1996.

[3] Provningsprocedur för manuell ultraljudsprovning av rör och komponenter, UT-01 rev 0. (Testing procedure for manual ultrasonic testing of pipes and components, UT-01 rev 0.)

[4] Enkvist, J., Edland, A., Svenson, O., Effects of Time Pressure and Noise in Non-Destructive Testing, SKI Report 01:48, Statens Kärnkraftinspektion, 2001.

[5] Enkvist, J., Edland, A., Svenson, O., Operator Performance in Non-Destructive Testing: A Study of Operator Performance in a Performance Test, SKI Report 00:26, Statens Kärnkraftinspektion, 2000.

[6] www.ski.se

[7] www.dnv.com

[8] www.sqc.se

[9] Brickstad, B., Zang, W., NURBIT, Nuclear RBI Analysis Tool, A Software for Risk Management of Nuclear Components, Technical Report No.10334900-1, DNV, Stockholm, Sweden, 2001.

[10] Sen, A., Srivastava, M., Regression analysis, Theory, Methods, and Applications, Springer-Verlag, 1990.

[11] McCullagh, P., Nelder J.A., Generalized Linear Models, 2nd edition, Chapman & Hall, London, 1989.

[12] Olsson, U., Generalized linear models: an applied approach, Lund, Studentlitteratur, 2002.


Appendices

A. Terminology

PSA, Probabilistic Safety Assessment

PSA is the dominating method for estimation of risk levels for components and systems, and of probability for core damage, in a nuclear power plant.

The conditional probability of core damage or radioactive release given fracture or leakage,

C = P(core damage | leakage),

can be estimated.

Risk reduction measures, for example repairs, where and how often NDT should be performed, improvement of security systems and leakage control, and re-constructions, are planned and evaluated considering the risk level of each component and theory for crack growth.

Results from this study will be used by DNV for risk evaluation using the RBI-code NURBIT, [9], with Risk Reduction Factor (RRF) defined as

$$\mathrm{RRF} = \frac{\mathrm{CDF}(\text{no ISI})}{\mathrm{CDF}(\text{with ISI using inspection interval } \Delta t)}, \tag{13}$$

where CDF is the estimated probability of core damage (per year),

CDF= P(small leak)⋅C(small leak) + P(large leak)⋅C(large leak) +P(rupture)⋅C(rupture).
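The arithmetic behind CDF and RRF can be sketched as follows. All probabilities in the example are hypothetical placeholders chosen only to show the calculation; they are not results from NURBIT or from this study, and the function names are made up.

```python
def core_damage_frequency(p_event, c_event):
    """CDF = sum over leak classes of P(event) * C(event), per year."""
    return sum(p_event[name] * c_event[name] for name in p_event)


def risk_reduction_factor(cdf_no_isi, cdf_with_isi):
    """RRF as in Equation (13): CDF without ISI divided by CDF with ISI."""
    return cdf_no_isi / cdf_with_isi


# Hypothetical conditional core damage probabilities C(event).
c = {"small leak": 1e-6, "large leak": 1e-4, "rupture": 1e-2}
# Hypothetical yearly event probabilities without and with in-service inspection.
p_no_isi = {"small leak": 1e-3, "large leak": 1e-5, "rupture": 1e-6}
p_with_isi = {"small leak": 5e-4, "large leak": 2e-6, "rupture": 1e-7}

cdf_no_isi = core_damage_frequency(p_no_isi, c)
cdf_with_isi = core_damage_frequency(p_with_isi, c)
rrf = risk_reduction_factor(cdf_no_isi, cdf_with_isi)
print(cdf_no_isi, cdf_with_isi, rrf)
```

An RRF above 1 means that inspection lowers the estimated core damage frequency; the POD estimated in this study would enter through the event probabilities with ISI.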

NDT, Non-destructive testing

Without destroying pipes and components, cracks and defects in metallic materials can be detected using different techniques, for example radiography, eddy current, ultrasonic testing, liquid penetrant, magnetic particle and visual inspection. Characteristics of these methods (except for ultrasonic testing) will not be mentioned here. Specially trained personnel perform NDT during the summer shutdown, the so-called revision, when the nuclear power plant is not running for about 3-5 weeks for inspection and service. The testing techniques have different qualities and are suitable in different situations; they might also complement each other. The risk level for a component is always reduced after NDT. If a defect is detected, some action will be taken: either it will be repaired or, if it is not so risky, a follow-up test will be planned; either way, the probability of leakage is reduced. If no defect is detected, the probability that the component does not contain any defects is increased and the probability of leakage reduced.

IGSCC, Intergranular Stress Corrosion Cracking


this study is so-called Intergranular Stress Corrosion Cracks (IGSCC) in stainless steel, surface-breaking cracks. Detection and characterization for NDT of IGSCC is described in the procedure UT-01, [3].

IGSCC is the most frequent type of defects in pipes and in boiling water reactors in Swedish nuclear power plants. About two risky cracks are detected at each revision. The risk of damage is increased for older components.

UT-01, Ultrasonic Testing procedure

To find cracks of type IGSCC, ultrasonic testing (UT) is common. At manual UT a small transmitter is placed on the pipe and a specific area is examined (mentioned in the sections about HAZ and CDE), transmitting sound waves with short wave length and high frequency into the material. If stopped by irregularities, the waves are reflected back to the transmitter and signals are shown on a monitor.

The procedure UT-01, [3], gives instructions on how to actually perform UT: how to detect defects, criteria for indications when to do a more careful examination, how to calibrate the equipment and so on. When a defect is detected and characterized as a crack, the size and location is measured.

Defects considered in UT-01 are surface breaking planar defects, i.e. cracks of type IGSCC, with a detection target of 2 mm and determination of depths at wall thickness > 7 mm. Cracks with tilt ±30° and with skew ±20° (meaning the crack is not parallel to a reference line, shown in Figure 4) are considered. The direction to search with the transmitter is angular to the weld for straight pipes and longitudinal and radial at elbows.

During NDT, several transmitters with different angles of transmission, frequencies and wavelengths might be used to get a maximum of information. To calibrate the equipment accurately is important but very time-consuming.

Mechanical ultrasonic testing has developed more and more. Today, 80-90% of all NDT with ultrasonic testing is performed mechanically, giving better results than manual testing in the normal case. In some situations, manual testing is the only and cheapest alternative, but when possible testing is done mechanically, especially in environments with high radiation and inside reactors or in water.

Round Robin-trials

Study performed in several groups/countries on the same test blocks, using the same or different methods, [1].

PISC


Qualification

Skilled operators are very important for a good result of NDT. This is why operators have to qualify before employment. First, there are courses to attend, and then you have to pass qualification tests, demonstrating in a practical way your knowledge about calibration, detection, characterization and estimation of crack size.

In 1998/1999 the first qualification tests took place in Sweden. Before, you did not have to be qualified to perform NDT at the nuclear power plants. A total of 42 persons have qualified since then. Today (December 2003), there are 24 qualified operators in Sweden; 14 qualifications will cease before summer. Every fifth year the qualification has to be updated and performance tests carried through once more. In between you have to be active as an operator.

At qualification, taking place during 5-7 days at SQC, pieces with about 20 cracks are tested (see Table 1). Detection of cracks is time consuming: about three pieces are tested per day; the corresponding figure for NDT in a nuclear power plant is two “pieces” per day. This gives an idea of the importance of an optimally performed PSA and of calculations of where to perform NDT. Some of the test pieces contain more than one crack, others none at all. The number is unknown to the operator. One of 6-8 transmitters is also calibrated during the qualification; to save time the rest are calibrated in advance at home. To make the environment as real as possible the test pieces are mounted at different heights, in a rack.

If not qualified at a first try, it is possible to make another after one month. The performance tests are judged all together, but also for each of the three groups, which the cracks are divided into according to wall thickness (mentioned below). At a second qualification you only redo tests on the corresponding group in which you failed the first time. This is to save time and money. If not passing the second test, a new qualification can be done after one year; then tests are performed on all three groups. The same person never tests the same test pieces when returning to SQC for a new qualification.

To be qualified you have to:

• correctly detect and characterize at least 70% within a group,

• correctly detect and characterize at least 80% of all defects,

• report not too many false calls.

A crack is considered correctly detected and characterized if reported as an IGSCC and placed more than 50% within a so-called hit box, surrounding the true crack by ±10 mm. Also, the tolerance of length measurement is ±20 mm and for depth ±2 mm/ ±3 mm for wall thickness ≤15 mm respectively >15 mm.

At the qualification tests, there are between 4 and 12 cracks in each of the groups mentioned above, which are defined as:

Group 1: wall thickness <7 mm,

Group 2: wall thickness 7-15 mm,

Group 3: wall thickness >15 mm.
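The pass criteria and the grouping by wall thickness can be summarized in a small sketch. It is only an illustration of the rules listed above: the function names are made up, and since the text does not say how many false calls are "too many", the limit `max_false_calls` is a hypothetical parameter.

```python
def thickness_group(wall_thickness_mm):
    """Group 1: < 7 mm, Group 2: 7-15 mm, Group 3: > 15 mm."""
    if wall_thickness_mm < 7:
        return 1
    if wall_thickness_mm <= 15:
        return 2
    return 3


def is_qualified(hits_by_group, false_calls, max_false_calls):
    """hits_by_group maps group number to a list of True/False per crack."""
    total_hits = total_cracks = 0
    for hits in hits_by_group.values():
        group_hits, group_cracks = sum(hits), len(hits)
        # At least 70% correctly detected and characterized within each group.
        if 10 * group_hits < 7 * group_cracks:
            return False
        total_hits += group_hits
        total_cracks += group_cracks
    # At least 80% of all defects.
    if 10 * total_hits < 8 * total_cracks:
        return False
    # "Not too many" false calls; the limit is an assumed input.
    return false_calls <= max_false_calls


# Hypothetical result: 4/4, 7/10 and 5/6 hits, i.e. 16 of 20 cracks overall.
results = {1: [True] * 4, 2: [True] * 7 + [False] * 3, 3: [True] * 5 + [False]}
print(is_qualified(results, false_calls=0, max_false_calls=2))
```

Integer comparisons are used for the percentage limits so that results exactly on the 70% or 80% boundary are not affected by floating-point rounding.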


MTO, Man-Technology-Organization

Operators performing NDT have to keep a high level of concentration and attention, so as not to miss any indications of cracks and defects. Time pressure, noise, heat, motivation, individual decision-making and long working hours are examples of factors affecting the performance, both physically and psychologically. To what extent stress affects performance, and how it is dealt with, is individual. Differences between operators, despite the same equipment and procedure (strategy), are a known fact. Also, the same individual might solve a problem differently depending on the situation. It is known from international studies that operators perform NDT somewhat differently depending on whether the situation is a qualification or a real one. At a qualification test, cracks are expected, unlike when testing in a real environment; at the nuclear power plants there are in general no cracks. Furthermore, when detecting, most of the small indications derive from defects and irregularities in the material, not cracks. The operators are aware of this and probably do not pay as much attention to small indications when testing for real as at the qualification tests. POD estimated from qualification tests will probably overestimate the true detection probabilities for small sizes. However, failing to find small cracks is not that dangerous; it is the large ones that cause a greater risk.

In two Swedish studies, [4] and [5], performance of NDT of IGSCC for UT-01 depending on working environment was examined from a psychological perspective. Already qualified operators (these people can also be found in qualification data) performed ultrasonic tests at SQC on a selection of the test pieces also used at the qualification tests. The first study examined the effect of attitude and opinion of importance of specific information for detection and characterization. Operators who used more time for both detection and characterization, with more regard to information when detecting and less when characterizing, had the best results. The second one examined the effect of time pressure and noise: the operators performed tests both under stress, i.e. noise and limited time, and without stress, and the results were compared. The operators turned out to get more focused and motivated when exposed to time pressure and noise, resulting in better tests. Data from these two MTO studies are also part of the data in this study. Since the tests are performed by operators already qualified, they might be closer to reality and therefore of more interest than the other data. The problem, though, is that the data are few.

HAZ, Heat Affected Zone

When two pipes are joined by a weld the material is affected by heat and cracks might appear. At detection, the areas on both sides of the weld are tested according to UT-01:

• ±25 mm for 3.9-6 mm wall thickness,

• ±15 mm for 6-11 mm wall thickness,

• ±10 mm for 11-40 mm wall thickness.

CDE, Cold-Deformed Elbows


DID, Service Induced Defects (Drift Inducerade Defekter)

The notation of real IGSCC cracks in pipes from a nuclear power plant. Most of the cracks in the data are manufactured fatigue cracks welded into the test pipe.

DNV, Det Norske Veritas

DNV was established in 1864. It is an independent foundation with the objective of safeguarding life, property and the environment. DNV is an international company with 5500 employees and 300 offices in 100 different countries, with headquarters in Oslo, Norway. DNV operates in multiple industries internationally, but in four industries they have a strong market presence and a large customer base. These industries are:

• Maritime,

• Oil & Gas,

• Process,

• Transportation (Rail and Automobile).

For more information, see [7].

Commissioned by SKI, DNV performs different research projects about security of systems and components in Swedish nuclear power plants. For example, this study is part of one such project.

SQC, Swedish Qualification Center

SQC is an independent, accredited qualification organ, qualifying personnel, techniques and procedures for different types of testing of components in Swedish nuclear power plants. SQC is owned by OKG Aktiebolag (OKG), Forsmarks Kraftgrupp AB (FKA), Barsebäck Kraft AB (BKAB) and Ringhals AB. See [8], for further information.

Data in this study are from qualification tests and MTO studies performed at SQC. Identities of people and defects are coded for anonymity.

SKI, Swedish Nuclear Power Inspectorate (Statens Kärnkraftinspektion)

SKI is the authority regulating and supervising all the nuclear activities in Sweden: nuclear fuel manufacture, nuclear power plant operation, transports and waste management. The authority’s mission is to ensure that the owner takes the full responsibility for safe operation, which is stipulated to the holder of a license to conduct nuclear activities. SKI also finances and conducts research and development into nuclear issues.

SKI works on behalf of the Government and reports to the Ministry of the Environment. It was formed in 1974 and currently has about 115 employees. For more information about SKI, see [6].

Nuclear Activities in Sweden


pressurized water reactor type. Figure 19 shows the location of the nuclear facilities in Sweden, (taken from [6]).

Nuclear power accounts for about half of the electricity generated in Sweden.

Severe incidents are very uncommon in Sweden and radioactive releases that exceed the limits have so far not occurred.


B. Statistical theory

GLIM, Generalized Linear Models

The statistical theory of General linear models (GLM), [10], is used for regression analyses when data follow a Normal (Gaussian) distribution. The relationship between the response y (the variable of interest) and the explaining variables x is expressed as a linear function (in matrix terms): y = Xβ + e, with independent normally distributed residuals e with constant variance. The mean value E(y) = Xβ = µ is called the linear predictor. In practice, data often follow other distributions, for example the Binomial distribution.

A more extended theory is Generalized linear models (GLIM), [11], [12], [13], concerning the whole exponential family of distributions. A so-called link function is now explaining the mean linearly by the linear predictor: g(µ)=Xβ.

The exponential family

The distribution of the variable Y belongs to the exponential family if the density can be written as

$$f(y; \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\},$$

where

θ = canonical parameter, a function of µ = E(y),

φ = dispersion parameter,

a, b, c some functions (a is often the identity function).

The parameters θ and φ are estimated with the maximum likelihood method (i.e. for a specific assumption of distribution they are defined in such a way that the probability for the observed result is maximized).

The Binomial, Gamma, Poisson or Normal distributions are examples of distributions belonging to the exponential family.

Log likelihood function

For a distribution in the exponential family, the log likelihood function can be written

$$l = \ln(f(y; \theta, \phi)) = \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi).$$

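which together with

$$\frac{\partial l}{\partial \theta} = \frac{y - b'(\theta)}{a(\phi)} \quad \text{and} \quad \frac{\partial^2 l}{\partial \theta^2} = -\frac{b''(\theta)}{a(\phi)},$$

from l, lead to the following expressions of mean and variance,

$$\mathrm{E}(Y) = b'(\theta), \qquad \mathrm{Var}(Y) = a(\phi) \cdot b''(\theta).$$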
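The step from the derivatives of l to the mean and variance presumably relies on the two standard identities for log likelihoods, E(∂l/∂θ) = 0 and E(∂²l/∂θ²) + E((∂l/∂θ)²) = 0, as given in for example [11]. Under that assumption, the calculation can be written out as

$$
\begin{aligned}
0 &= \mathrm{E}\left(\frac{\partial l}{\partial \theta}\right) = \frac{\mathrm{E}(Y) - b'(\theta)}{a(\phi)}
&&\Rightarrow\quad \mathrm{E}(Y) = b'(\theta), \\
0 &= \mathrm{E}\left(\frac{\partial^2 l}{\partial \theta^2}\right) + \mathrm{E}\left(\left(\frac{\partial l}{\partial \theta}\right)^2\right)
= -\frac{b''(\theta)}{a(\phi)} + \frac{\mathrm{E}\left((Y - b'(\theta))^2\right)}{a(\phi)^2}
&&\Rightarrow\quad \mathrm{Var}(Y) = a(\phi) \cdot b''(\theta).
\end{aligned}
$$

In the second line, E((Y − b′(θ))²) equals Var(Y) because b′(θ) is the mean of Y by the first line.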

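As an example, the Binomial distribution, Y ~ Bin(n, p) with E(Y) = n⋅p, can be defined by

$$f(y; p) = \exp\left\{ y \cdot \theta - n \cdot \ln(1 + \exp(\theta)) + \ln\binom{n}{y} \right\},$$

with canonical parameter θ = ln(p/(1−p)), so that b(θ) = n⋅ln(1 + exp(θ)) and a(φ) = 1.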

The mean and variance of the Binomial distribution are then given by

$$\mathrm{E}(Y) = b'(\theta) = n \cdot \frac{\exp(\theta)}{1 + \exp(\theta)} = n \cdot p,$$

$$\mathrm{Var}(Y) = b''(\theta) = n \cdot \frac{\exp(\theta)(1 + \exp(\theta)) - \exp(\theta)\exp(\theta)}{(1 + \exp(\theta))^2} = n \cdot \frac{\exp(\theta)}{1 + \exp(\theta)} \cdot \frac{1}{1 + \exp(\theta)} = n \cdot p \cdot (1 - p).$$

Link function

The link function g(µ)=Xβ must be monotone and differentiable. The choice of link function depends on the type of data. Each distribution in the exponential family has a so-called canonical link, in the same form as the canonical parameter, g(µ)=θ. However, the canonical link is not necessarily always the best.

For the response y/n, where Y ∼ Bin(n, p), the mean is µ = p. Common links for the Binomial distribution are:

probit: $g(p) = \Phi^{-1}(p)$,

logit: $g(p) = \log\left(\frac{p}{1-p}\right)$ (canonical link function),

CLL: $g(p) = \log(-\log(1-p))$ (complementary log-log link).

Their inverses, $g^{-1}$, restrict the mean to the interval [0,1]: $g^{-1}(g(p)) = p \in [0,1]$. This is appropriate when the response, i.e. the estimated probability, can only take these values.
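The three links and their inverses can be written down directly. The sketch below uses only the Python standard library; the dictionary `LINKS` and its keys are names chosen for this illustration.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Each entry holds (link g, inverse link g^-1).
LINKS = {
    "probit": (lambda p: STD_NORMAL.inv_cdf(p),
               lambda eta: STD_NORMAL.cdf(eta)),
    "logit": (lambda p: log(p / (1.0 - p)),
              lambda eta: 1.0 / (1.0 + exp(-eta))),
    "cloglog": (lambda p: log(-log(1.0 - p)),
                lambda eta: 1.0 - exp(-exp(eta))),
}

# The inverse link maps any linear predictor back into the interval (0, 1).
for name, (g, g_inv) in LINKS.items():
    for p in (0.05, 0.5, 0.95):
        print(name, p, round(g(p), 4), round(g_inv(g(p)), 4))
```

The probit and logit links are symmetric around p = 0.5, while the complementary log-log link is not, which is one reason the choice of link can matter for the fitted POD curve.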

Estimation of parameters

How the explaining variables $x_1, x_2, \ldots, x_{R-1}$ affect the response of the model is examined by estimating the parameters $\beta_0, \ldots, \beta_{R-1}$ with the maximum likelihood method. For a single observation, as before,

$$l = \ln(f(y; \theta, \phi)) = \frac{y \cdot \theta - b(\theta)}{a(\phi)} + c(y, \phi).$$

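The value for which the derivative of l with respect to the parameter $\beta_j$ equals zero maximizes l and gives $\hat{\beta}_j$, the estimate of $\beta_j$, j = 0, ..., R−1. Since θ is a function of µ,

$$\frac{\partial l}{\partial \beta_j} = \frac{\partial l}{\partial \theta} \cdot \frac{\partial \theta}{\partial \mu} \cdot \frac{\partial \mu}{\partial \eta} \cdot \frac{\partial \eta}{\partial \beta_j},$$

where

$b'(\theta) = \mu$,

$b''(\theta) = \frac{\partial \mu}{\partial \theta} = V$ (the variance function),

$\eta = X\beta = \sum_{j=0}^{R-1} x_j \beta_j$ with $\frac{\partial \eta}{\partial \beta_j} = x_j$, and $W^{-1} = \left(\frac{\partial \eta}{\partial \mu}\right)^2 \cdot V$.

This gives the expression

$$\frac{\partial l}{\partial \beta_j} = \frac{y - \mu}{a(\phi)} \cdot \frac{1}{V} \cdot \frac{\partial \mu}{\partial \eta} \cdot x_j = \frac{W \cdot (y - \mu)}{a(\phi)} \cdot \frac{\partial \eta}{\partial \mu} \cdot x_j.$$

By summing over all observations (i = 1, ..., N),

$$\frac{\partial l}{\partial \beta_j} = \sum_i \frac{W_i \cdot (y_i - \mu_i)}{a(\phi)} \cdot \frac{\partial \eta_i}{\partial \mu_i} \cdot x_{i,j}.$$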
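One common way to solve these score equations numerically is Fisher scoring, where the parameters are updated repeatedly using the score and the expected information. The sketch below does this for a binomial response with probit link and a single explaining variable, with a(φ) = 1. It is an illustration under stated assumptions and not a reproduction of the SAS calculations in this study: the detection data are hypothetical and the function names are made up.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()


def score_and_information(b0, b1, x, y, n):
    """Score vector and expected information for POD = Phi(b0 + b1 * x)."""
    s0 = s1 = i00 = i01 = i11 = 0.0
    for xi, yi, ni in zip(x, y, n):
        eta = b0 + b1 * xi
        p = STD_NORMAL.cdf(eta)
        dens = STD_NORMAL.pdf(eta)  # dp/d(eta) for the probit link
        u = (yi - ni * p) * dens / (p * (1.0 - p))  # score term without x
        w = ni * dens ** 2 / (p * (1.0 - p))        # weight W_i
        s0 += u
        s1 += u * xi
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return (s0, s1), (i00, i01, i11)


def fit_probit(x, y, n, iterations=50):
    """Fisher scoring: beta <- beta + information^-1 * score, from beta = 0."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        (s0, s1), (i00, i01, i11) = score_and_information(b0, b1, x, y, n)
        det = i00 * i11 - i01 * i01
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    return b0, b1


# Hypothetical data: detections out of trials for cracks of five depths (mm).
depths_mm = [1, 2, 4, 8, 16]
trials = [6, 6, 6, 6, 6]
detections = [2, 3, 4, 5, 6]
x = [log(d) for d in depths_mm]

b0, b1 = fit_probit(x, detections, trials)
print(round(b0, 4), round(b1, 4))
```

At the returned estimates both components of the score are numerically zero, which is the condition derived above. The sketch does not handle overdispersion or fitted probabilities that reach exactly 0 or 1.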