• No results found

Research Report Statistical Research Unit Department of Economics Göteborg University Sweden

N/A
N/A
Protected

Academic year: 2021

Share "Research Report Statistical Research Unit Department of Economics Göteborg University Sweden"

Copied!
30
0
0

Loading.... (view fulltext now)

Full text

(1)

Research Report 2007:1 ISSN 0349-8034 Mailing address: Fax Phone Home Page:

Research Report

Statistical Research Unit

Department of Economics

Göteborg University

Sweden

Effect of dependency in systems for

multivariate surveillance

(2)

Effect of dependency in systems for multivariate surveillance

E Andersson

Statistical Research Unit, School of Economics and Commercial Law, Göteborg University, Sweden.

________________________________________________________________________

Abstract

In many situations we need a system for detecting changes early. Examples are early detection of disease outbreaks, of patients at risk and of financial instability. In influenza outbreaks, for example, we want to detect an increase in the number of cases. Important indicators might be the number of cases of influenza-like illness and pharmacy sales (e.g. aspirin). By continually monitoring these indicators, we can early detect a change in the process of interest. The methodology of statistical surveillance is used. Often, the conclusions about the process(es) of interest is improved if the surveillance is based on several indicators. Here three systems for multivariate surveillance are compared. One system, called LRpar, is based on parallel likelihood ratio methods, since the likelihood ratio has been shown to have several optimality properties. In LRpar, the marginal density of each indicator is monitored and an alarm is called as soon as one of the likelihood ratios exceeds its alarm limit. The LRpar is compared to an optimal alarm system, called LRjoint, which is derived from the full likelihood ratio for the joint density. The performances of LRpar and LRjoint are compared to a system where the Hotellings T2 is monitored. The evaluation is made using the delay of a motivated alarm, as a function of the times of the changes. The effect of dependency is investigated: both dependency between the monitored processes and correlation between the time points when the changes occur. When the first change occurs immediately, the three methods work rather similarly, for independent processes and zero correlation between the change times. But when all processes change later, the T2 has much longer delay than LRjoint and LRpar. This holds both when the processes are independent and when they have a positive covariance. When we assume a positive correlation between the change times, the LRjoint yields a shorter delay than LRpar when the changes actually do occur simultaneously, whereas the opposite is true when the changes do actually occur at different time point.

Keywords: Multivariate; Surveillance; Dependency; Optimal; Covariance; Likelihood

(3)

1. Introduction

In many situations it is important to monitor a process in order to detect an important change as soon as possible. Examples are turning point detection in business cycles (Neftci (1982), Hamilton (1989), Andersson et al. (2004)), in hormone cycles (Royston (1991)), in influenza cycles (Baron (2002)) and in financial cycles (Bock (2003)). Another area is detection of growth retardation of foetuses (Petzold et al. (2004)). Yet another is detection of an increased level, emerging from a source and spreading spatially (Järpe (1999)). In on-line monitoring we have repeated decisions: at each time point, a new observation becomes available and a new decision is made as to whether the process has changed or nor. The methodology of statistical surveillance is appropriate.

Statistical surveillance is a methodology for discriminating between two events, ”the change has occurred” and ”the change has not occurred”. The time of change is unknown. The time scale can differ between applications (days, weeks, months), but common to all surveillance are the repeated decisions, made at each time point. The decision is made using an alarm statistic and an alarm limit. In industrial quality control charts (e.g. xbar-charts), the decision rule can be that an alarm is given as soon as an observation crosses the alarm limit (Shewhart (1931)). There is always a risk for a false alarm, but the parameters of the surveillance method are chosen so that we know the false alarm property. For motivated alarms, i.e. when the change actually happens, we want a quick detection. Since we have repeated decisions, hence size and power are not appropriate measures. Instead we have a trade-off between false alarms and delay of motivated alarms or detection probability. The false alarms are often controlled by a fixed average run length, ARL0. The delay can be measured by the expected delay time between the change time and the alarm. Besides the Shewhart method, the EWMA method (Roberts (1959)) and the CUSUM method (Page (1954)) can be mentioned.

In many situations we monitor several processes, which may change simultaneously or at different times. There are different approaches in multivariate monitoring: the data can be reduced to a scalar at each time or we can use separate alarm systems for each process in combination with an inference rule like union intersection. Also, multivariate versions of univariate methods have been suggested, for example MEWMA and MCUSUM. The choice of method depends on e.g. whether the processes under surveillance are correlated and whether the change times are dependent.

The aim of this paper is to compare different multivariate surveillance systems. In Section 1, optimal univariate surveillance is discussed and different approaches to multivariate surveillance are exemplified. This section also includes some previously suggested methods for multivariate surveillance. Section 2 presents the surveillance methods, used in this paper. In Section 3 and 4, the results of the study are presented and section 5 contains a discussion.

(4)

alarm system we decide whether D or C is the most likely. Optimal alarm systems are based on the likelihood ratio between C and D (Shiryaev (1963), Frisén and de Maré (1991)), see further in Section 1.2.3.

Statistical surveillance is used for on-line detection of an important change in the underlying process, for instance the expected value. For daily data, a new decision is made each day, based on the available data. When there is enough evidence of a change, an alarm is called. The surveillance system (alarm statistic and alarm limit) is specified to have a known false alarm risk by adjusting the alarm limit.

Generally, we get a new observation on the observed process X at each time s={1, 2, …} and the surveillance system is used to discriminate between events C(s) and D(s). At an unknown time, τ, there is a change in the distribution of X, so that

( ( ) ), ( ) ( ( ) ), f x s D s X s f x s C s τ τ ⎧ < ⎪ ⎨ ⎪⎩ ∼

The events D and C are specified according to the change of interest.

As mentioned above, the likelihood ratio between C and D is optimal. If we want to detect a change at the current time s, then C={τ=s}, D={τ>s} and the likelihood ratio is

( ) ( ) S s f x s k f x s τ τ = > > ,

where xs={x(1), x(2), ..., x(s)} and k=alarm limit. E.g. if it is of interest to detect a change in the expected value (from μ0 to μ1) and the observations are independent and normally distributed, then the optimal LR reduces to the last observation, x(s)

1 0 ( ( ) ) ( ( ) ) f x s f x s μ μ μ μ = = >k <=> ( )x s > 2 0 2 1 2 1 0 2 ln( ) ( ) ( ) 2( ) k σ μ μ μ μ ⋅ − − − .

This is the Shewhart method, used in e.g. an xbar-chart. Contrary to using only the latest observation, the CUSUM method cumulates the obsrvations since the start of the surveillance. CUSUM is the maximal of the likelihood ratios at the decision time s. Another surveillance method is EWMA, where all observations are weighted together. EWMA can not be exactly derived from the likelihood ratio, but for certain values of the smoothing constant, EWMA is approximately the same as LR (see Frisén and Sonesson (2002)).

For the situation when it is important to detect if there has been a change at some time point since the start of the surveillance, we specify C ={τ≤s}={{τ=1}, {τ=2}, ..., {τ=s}}. Then the LR consists of s components

1 ( 1) 2 ( 2) ... ( ) ( ) ( ) ( ) S S S s s s s s f x f x f x s w w w k f x s f x s f x s τ τ τ τ τ τ = = = + + + > > > > , (1)

where wi=P(τ=i)/P(τ≤s) and ks = k⋅P(τ≥s)/P(τ<s). Thus for C={τ≤s}, it is optimal to use all observations {x(1), ..., x(s)}. The time of alarm, tA, is defined as the first time that the alarm statistic exceeds the alarm limit,

(5)

When P(C) = 1-P(D), the likelihood ratio is equivalent to the posterior probability, with alarm rule

P C x( s)>k,

where k is the constant alarm limit. The posterior probability is often used in hidden Markov model approaches (HMM).

The alarm limit is adjusted so that the false alarm property has a specified value. In quality control, a common false alarm property is the average run length to the first false alarm, ARL0=E[tA⏐τ=∞]. Gan (1993) instead suggested the use of the median run length (MRL0). In many theoretical work, the false alarm probability is used, which summarize the false alarm distribution using the distribution of τ, such that

PFA= 1 ( A ) ( ) i P t i Pτ i ∞ = < ⋅ =

.

For an on-line system, the ability to detect a change quickly is important, i.e. to have a short delay for motivated alarms. For most surveillance methods, the delay of an alarm depends on when the change do occur, in relation to the start of the surveillance and the delay is often longest when the change occurs at the start (τ=1). The conditional expected delay of an alarm (see Frisén and Wessman (1999)) is defined as

CED=E t[ A−τ tA ≥τ τ, =t].

Many evaluations are made using τ=1, e.g CED(1) which is equivalent to ARL1-1.

However it is important to consider other change point times also.

1.2 Different approaches to multivariate surveillance

In a multivariate setting there are p processes {X1, X2, ..., Xp} for which ( ( ) ), ( ) ( ( ) ), j j j j j f x s D s X s f x s C s τ τ ⎧ < ⎪ ⎨ ⎪⎩ ∼ where j={1, 2, ..., p}.

(6)

1.2.1 Reduction to one scalar statistic at each time

The p processes can be reduced to a scalar at each time, for example by calculating the (weighted) mean. Wessman (1998) showed that when the processes have identical change times (τ1=τ2=...=τp=τ), then there exists a sufficient univariate reduction of the variables {X1, ..., Xp}. Thus, without loss of information, the multivariate data can be reduced to a scalar statistic and then univariate surveillance can be applied. The sufficiency also holds when the changes times are not identical but the lag times between them are known. 1.2.2 Parallel surveillance

Multivariate surveillance can be made by monitoring the marginal density of each process. One drawback is that no information about the dependency structure is used. The surveillance system for X1 is only concerned with τ1, and correspondingly for X2, X3 etc. For process Xj at time s, we have alarm statistic p(xjs) and alarm limit kjs, so that an alarm is called when

p(xjs) > kjs.,

where xjs={xj(1), xj(2), ..., xj(s)}. The time of alarm of Xj is tAj=min t: p(xjt) > kjt.

The time of alarm for the whole system is defined as

tA = min {tA1, tA2...}. (2)

Does et al. (1999) uses separate surveillance for of each process (there each principal component) in a case study.

1.2.3 Simultaneous solution

It was shown by Shiryaev (1963) that the likelihood ratio between C and D is optimal in the sense that is maximized the expected utility, E[u], where u equals

(

)

(

A

)

A A 1 A 2 A h t -τ , t τ u(t , τ) a t -τ a , t τ < ⎧⎪ = ⎨ + ⎪⎩

The optimality was proven for the situation when τ follows a Geometric distribution (with parameter ν). The utility expression above is easily applied to the univariate situation. If the function h(tA - τ) is a constant, b, then it has been shown that the utility is maximized when the delay, tA-τ, is minimized (see Frisén (2003)). The minimal expected delay was shown to hold also for a situation where τ is not Geometrically distributed (Andersson (2004)).

(7)

Then the utility can be written as

(

)

(

)

A (1) A (1) A (1) 1 A (1) 2 A (1) h t -τ , t τ u(t , τ ) a t -τ a , t τ ⎧ < ⎪ = ⎨ ⋅ + ≥ ⎪⎩

where tA is the time of alarm for the whole system (e.g. (2) in Section 1.2.2). The distribution of τ(1) depends on the p-variate distribution for (τ1, ..., τp). For p=2, we use the bivariate Geometric distribution, see e.g. Marshall and Olkin (1997). Then τ(1) follows a Geometric distribution (Sun and Basu (1995)) and the optimality holds: the likelihood ratio maximizes the expected utility.

Wessman (1999) investigated the situation where it is of interest to decide whether the first change has occurred at the current time point, C={τ(1)=s}. In this paper we want to decide whether the first change has started at any time point since the beginning of the surveillance, C={τ(1)≤s}={{τ(1)=1},...,{τ(1)=s}}. The D event is specified as {τ(1)>s}. The optimal surveillance system (simultaneous solution) is derived from the likelihood ratio. Since P(C)=1-P(D), the likelihood ratio is equivalent to the posterior probaility

P C m( s)>k,

where ms={m(1),..., m(s)}={{x1(1), ..., xp(1)},..., {x1(s)..., xp(s)}} and k=constant alarm limit. Since P(C) = 1-P(D), the alarm rule is

( ) ( ) ( ) s s s P m C k P m C P m D> ∩ + ∩ .

Hence forward we express the likelihood function by f(*) instead of P(*), so that

( ) ( ) 1 s s f m C k f m D k ∩ > ∩ −

We specify C={τ(1)≤s} and D={τ(1)>s} and the likelihood ratio alarm rule consists of partial likelihood ratios

(1) (1) (1) 1 2 (1) (1) (1) ( 1) ( 2) ( ) ... ( ) ( ) ( ) s s s s s s s s s s f m f m f m s w w w f m s f m s f m s τ τ τ τ τ τ = = = ⋅ + ⋅ + + ⋅ > > > >ks, expressed as 1 ( , ) s j s j w L s j = ⋅

.

The weights, wts=P(τ(1)=t)/P(τ(1)≤s), and the limit, ks=k’⋅P(τ(1)>s)/P(τ(1)≤s), depends on the distribution of τ(1), which in turn depends on the joint distribution of (τ1, ..., τp). The time of alarm is defined as

(8)

1.3 Suggested approaches for multivariate surveillance

Some of the suggested approaches for monitoring multivariate data are reviewed. 1.3.1 Multivariate monitoring using a reduction

As mentioned before, Wessman (1998) showed that when the changes occur simultaneously, there exists a sufficient reduction of the data. Multivariate data can e.g. be reduced by the Hotellings T2 statistic or by principal component analysis.

When the changes occur at different time points, a problem with a reduction is to determine which variable that causes the alarm. Javaheri and Houshmand (2001) suggest a follow-up with discriminant analysis to investigate which variables that cause the alarm. Jolayemi (2000) constructs multiple control regions for two assignable causes. Kalagonda and Kulkarni (2003) propose constructing a model for the in-control process and using dummy variables to determine the nature of the change. Mason et al. (1995) decompose T2 into independent components, each reflecting an individual variable Xj. Kourti and MacGregor (1996) show that the T2 reduction works well when the dimension is not too high.

Wikström et al. (1998) use principal component analysis to derive the most important principal components and then univariate CUSUM and univariate EWMA is applied to

the first principal component.

Abu-Shawiesh and Abdullah (2001) study surveillance of two correlated processes. The T2 statistic is based on robust estimates of location and scale and the situation is one where both scale and location change. This approach is compared to the ordinary T2. Mason et al. (2003) find that AR processes result in a U shaped T2 curve and if the data contains trend, the T2 values will exhibit a trend.

Another reduction is to use the minimum and maximum values at each time, as is done in Sepúlveda and Nachlas (1997). Charnes (1995) studies the situation where the processes under surveillance are time dependent and where the residuals (or forecasting errors) are used in the monitoring. Kang and Albin (2000) model the variable Y as a function of the variable X and monitor the slope and intercept by T2 statistic as well as the residuals (deviations from reference line).

Aparisi et al. (2001) reduce the covariance matrix through the determinant or the trace. This scalar is monitored, using univariate Shewhart and univariate EWMA and CUSUM. Guerrero-Cusumano (1995) uses an entropy measure instead of the determinant.

Stoumbos and Jones (2000) reduce the multivariate data to a probability measure which shows how “central” the observation is.

Lu et al. (1998) study correlated variables and the proportion of non-conforming units. The multivariate data are reduced to a scalar by weighting together the non-conforming units and then monitored by univariate Shewhart.

Cheng and Liu (2000) use a rank measure for how outlying an observation is and then the univariate rank variable is monitored using a Shewhart approach.

(9)

Qiu and Hawkins (2001) use the anti-ranks of the p observations at each time and summarizes the anti ranks in a statistic which is monitored using univariate CUSUM. 1.3.2 MEWMA and MCUSUM

From the univariate specification of the EWMA and CUSUM methods, several multivariate variants have been developed.

Bodden and Rigdon (1999) use a multivariate EWMA and smooth all p processes using the same constant, λ, and then the vector of the smoothed values is reduced by the T2 statistic. This approach is also used in Love and Linderman (2003) and Molnau et al. (2001) and by Stoumbos and Sullivan (2002), who show that if the smoothing constant equals 1 (no smoothing) and the process is not normally distributed, then the ARL0 is over estimated if the alarm limits are determined under normality assumptions. But for a small smoothing constant, the ARL0 is not so biased, if the normality assumption is violated. A small smoothing constant gives approximately equal weight to all observations, which is approximately normal according to the central limit theorem. Lowry et al. (1992) use EWMA smoothing with separate λ values, and then reduction by the T2 statistic. The λ values are determined so as to minimize the ARL1. Yumin (1996) suggests that if the p X processes are correlated, they should be transformed into principal components, which are then smoothed separately. Runger et al. (1999) reduce the dimension by a transform similar to principal component analysis, then the transformations are smoothed using the same λ and then the T2 statistic is calculated from the smoothed series. Gan (1997) constructs a control chart with the smoothed variance on one axis and the smoothed mean on the other, called a combined EWMA chart. The advantage, to a T2 reduction, is that the chart shows whether it is the mean or the variance that is out-of control.

A similar approach is used in Woodall and Ncube (1985), but with a univariate CUSUM for each of the p processes. Hawkins (1991) applied a univariate CUSUM to a linear combination of the variables {X1, X2, ..., Xp}. Wessman (1998) showed that if the p processes have identical change times, then data can, without loss of information, be reduced to a univariate index.

Crosier (1988) and Pignatiello and Runger (1990) used an MCUSUM on the same form as as the univariate CUSUM, only with matrixes instead of scalars.

2. Model and methods

In this paper we study the situation when two processes may change at different time points, i.e. τ1 and τ2 may have different values. The τ values can be dependent.

(10)

At decision time s, the observations (X, Y) are modeled according to (1) (2) ... ( ) m m m s ⎛ ⎞ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎝ ⎠ = (1) (1) (2) (2) ... ... ( ) ( ) x y x y x s y s ⎛ ⎞ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎝ ⎠ = (1) (1) (2) (2) ... ... ( ) ( ) X Y X Y X s Y s μ μ μ μ μ μ ⎛ ⎞ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎝ ⎠ + (1) (1) (2) (2) ... ... ( ) ( ) X Y X Y X s Y s ε ε ε ε ε ε ⎛ ⎞ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎝ ⎠ (4)

The stochastic term at time t,

(

εX( )t εY( )t

)

, follows a bivariate normal distribution,

with expected value zero and covariance matrix

2 1 1 ρ σ ρ ⎛ ⎞ ⎜ ⎟ ⎝ ⎠, (5)

where ρ=correlation between X(t) and Y(t).

The variables X(t) and Y(t) are (possibly) dependent but not X(t) and X(t-j), Y(t) and Y(t-j) or X(t) and Y(t-j). The aim is to detect the first change in either of the μ vectors (μX, μY), where one application is detection of the start of an influenza epidemic. The SMI receives weekly information about confirmed cases (laboratory diagnosed, see Figure 1) and suspected cases. The development is roughly captured through the following parametric model for μ, which will be used for both X and Y in the investigation below. μ(t): D 01 Cτ 3 02 1 2 μ (t): β , t τ μ (t): β β (t+1-τ) β (t+1-τ) , t τ ⎧ < ⎨ + ⋅ + ⋅ ≥ ⎩ (6)

where β1 < 0, β2 > 0. Thus, for τX=τY, X and Y have the same distribution and they are independent (or dependent) conditional on τ. The model above might be too crude in a real situation, but is used here to demonstrate the inferential aspects.

Time 8 7 6 5 4 3 2 1 0 μ τ 5 3 1 Infinity

Figure 1: Left: The number of confirmed cases of influenza in Sweden for three con-sequtive seasons (source: SMI). Right: The model for μ (for different starting times).

As mentioned above, a bivariate Geometric distribution is used for (τX, τY), with parameters (ν01, ν10, ν11). The marginal distributions are τX∼Geo(ν1.=ν10+ν11) and

(11)

(ν1.=ν.1=0.10). The value of ν11 will be varied, thereby yielding different correlations, here {0, 0.80}. The correlation between τX and τY is from now denoted ψ.

Three methods will be compared, corresponding to the approaches of reduction to a scalar, parallel surveillance and simultaneous solution.

2.1 The T2 method in surveillance

An early multivariate method is the T2 method of Hotelling (1947). The statistic, (x-μD)’ Σ-1 (x-μD), is here assumed to have a known covariance matrix, see e.g. Alt (1985). The T2 is an example of a reduction. We assume that X and Y have the same distribution, conditional on τX and τY and thus an alarm is given when

T2(s) = 2 2 2 2 2 2 2 2 ( ( ) ( )) ( ( ) ( )) 2 ( ( ) ( ))( ( ) ( )) (1 ) (1 ) (1 ) D D D D x s μ s y s μ s ρ x s μ s y s μ s ρ σ ρ σ ρ σ − + − − − − − >k.

2.2 Parallel likelihood ratio systems

The parallel surveillance system in this paper is based on separate likelihood ratio methods for X and Y. This system is hence forward denoted LRpar. For the process X we have the following likelihood ratio alarm statistic

LRX(s) = 2 2 1 1 2 2 1 ( ( ) ( )) ( ( ) ( )) ( ) ( ) exp ( ) 2 2 ( ) s s D Cj s t t X j X x t t x t t P s P j k P s P s μ μ τ τ τ = σ = σ τ = ⎛ ⎞ − − ⎜ ⎟ > = − > ≤ ⎜ ⎟ ≤ ⎜ ⎟ ⎝ ⎠

where μD and μC are defined in (6) and τ is assumed to follow a geometric distribution. The LRY(s) statistic is defined correspondingly. The time of alarm for X is defined as tAX = min[LRX(s) > kXs]

where kXs =k⋅(PτX>s)/P(τX≤s), and correspondingly for Y. The time of alarm for the LRpar system is the first time for which either alarm system gives an alarm, i.e.

tA=min{tAX, tAY}.

If X and Y are independent, the distribution of tA is a direct function of the distributions of tAX and tAY.

(12)

2.3 Simultaneous solution using the joint likelihood ratio

The optimal surveillance method for C={τ(1)≤s} was derived in Section 1.2.3 and was shown to consist of s weighted partial likelihood ratios {L(s,1), ..., L(s,s)}, such that

wjs⋅L(s, j)= (1) (1) (1) (1) ( ) ( ) ( ) ( ) s s f m j P j P s f m s τ τ τ τ = = ⋅ ≤ > .

The event {τ(1) = j} consists of the following sub-events: {τX =j ∩ τY >j}, event C-D, “a turn in X at time j” {τX >j ∩ τY =j }, event D-C, “a turn in Y at time j”

{τX =j ∩ τY =j}, event C-C, “turn in both X and Y at time j”. Thus, wjs⋅L(s,j) is weighted together as

j0 ( , , ) 0j ( , , ) jj ( , , ) s s s wL s j CD +wL s j D C− +wL s j CC = 0 ( , ) 0 ( , ) ( , ) ( , ) ( , ) ( , ) s X Y s X Y s X Y j j jj s s s s X Y s X Y s X Y f m j j f m j j f m j j w w w f m j j f m j j f m j j τ τ τ τ τ τ τ τ τ τ τ τ = > > = = = ⋅ + ⋅ + ⋅ > > > > > >

where the weights are

0 (1) ( ) ( ) j X Y s P j j w P s τ τ τ = ∩ > = ≤ , 0 (1) ( ) ( ) j X Y s P j j w P s τ τ τ > ∩ = = ≤ , 1 (1) 2 ( ) ( ) jj s P j j w P s τ τ τ = ∩ = = ≤ .

Thus, the information about the bivariate distribution of (τX, τY) is included through the weights and the alarm limit, k’ P(τ(1)>s)/P(τ(1)≤s).

The complete optimal LR statistic at decision time s consists of 3⋅s components

0 0 1 ( , ) ( , ) ( , ) ( , ) ( , ) ( , ) s s X Y s X Y s X Y j j jj s s s j s X Y s X Y s X Y f m j j f m j j f m j j w w w f m j j f m j j f m j j τ τ τ τ τ τ τ τ τ τ τ τ = ⎡ = > > = = = ⎤ + + ⎢ > > > > > > ⎥ ⎢ ⎥ ⎣ ⎦

.

For example, the event {τX =j, τY >j} means that there is a change in μX at time j, but no change yet in μY. Then the L(s,j) equals

0 ( , , ) ( , , ) Cj D s s X Y j s D D s s X Y f x y w f x y μ μ μ μ μ μ μ μ = = = = + 0 ( , , ) ( , , ) D Cj s s X Y j s D D s s X Y f x y w f x y μ μ μ μ μ μ μ μ = = = = + ( , , ) ( , , ) Cj Cj s s X Y jj s D D s s X Y f x y w f x y μ μ μ μ μ μ μ μ = = = = ,

where f(xs, ys) is the bivariate normal distribution with expected values as in (4) and covariance matrix as in (5). Because of independence over time, we have e.g.

(13)

For example, the denominator, ( , X D, Y D) s s f x y μ =μ μ =μ , is 2 2 1 2 1 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) exp 2 D D D D s t X X Y Y x t t x t t y t t y t t c c μ ρ μ μ μ σ σ σ σ = ⎛ ⎡ ⎞⎛ ⎞ ⎛ ⎤⎞ ⎜ ⎢ ⎥⎟ ⋅ − ⋅ ⎟⎜ ⎟ ⎜+ ⎠⎝ ⎠ ⎝ ⎟ ⎝

⎠ where 2 2 1 2 1 1 , 2 (1 ) 2 1 X Y c c ρ πσ σ ρ = = ⋅ − .

This system is an example of a simultaneous solution where the joint distribution is used, both for (τX, τY) and for (X,Y). The system is hence forward denoted the LRjoint. If X and Y are independent, the L(s, j) simplifies to

0 ( ) ( ) s X j s s X f x j w f x s τ τ = > + 0 ( ) ( ) s Y j s s Y f y j w f y s τ τ = > + ( ) ( ) ( ) ( ) s X s Y jj s s X s Y f x j f y j w f x s f y s τ τ τ τ = = ⋅ > > .

Thus, for independent X and Y, the components L(s,j,C-D) and L(s,j, D-C) are the same components that are used in the LRpar.

2.4 Alarm regions for the methods

The methods T2, LRpar and LRjoint are made comparable by adjusting their respective alarm limits so that they all have the false alarm probability 0.10, defined as

PFA= (1) 1 ( A ) ( ) i P t i Pτ i ∞ = < ⋅ =

. (3)

The alarm regions for the three methods are illustrated in Figure 2, for different values of ψ (correlation between the change times) and different values of ρ (dependency between X and Y). LR(xs) LR(ys) 1550 1350 1150 950 750 550 350 150 -50 -250 -450 -650 ψ=0.8 ρ=0.5ψ=0 LR(xs) LR(ys) 2000 1500 1000 500 0 -500 -1000 ψ=0.8 ψ=0 ρ=0.5 x(s) y( s) 4 3 2 1 0 -1 -2 -3 -4 ψ=0 ψ=0.8 ρ=0.5

Figure 2: Alarm limit is marked by open square for {ψ=0, ρ=0}, filled square for {ψ=0.8, ρ=0}, star for {ρ=0.5, ψ=0 }. Left: LRpar, middle: LRjoint, right: T2.

(14)

joint LR statistic is high, thus if either X or Y or (X+Y) deviates from the D-state. The same holds for T2.

For LRpar, the shape of the alarm region (square) is unaffected by both ψ and ρ. For LRjoint, the shape changes in one direction when ψ increases and in another direction when ρ increases. The T2 method has a circle shaped alarm region for independent X and Y, whose size changes with ψ. But when ρ increases, the shape of the alarm region becomes more tilted and oval.

2.5 Evaluation in multivariate surveillance

The aim is to make inference about the first change, τ(1), and the false alarms are controlled by PFA (see (3)). The performance in surveillance can be measured by the delay of a motivated alarm, e.g. by the CED in Section 1.1. In multivariate surveillance, the delay depends on both change times, τX and τY, and the CED can be defined as

CED(t1, t2) = E[tA-τ(1)⏐ tA≥τ(1), τX=t1, τY =t2]. (7)

3. Results on false alarms

The methods T2, LRpar and LRjoint are compared for different types of dependencies. The methods are comparable by all having PFA=0.1, where PFA is summarized using the bivariate Geometric distribution that results in a specified ψ (correlation between τX and τY). In LRpar, the intensity ν=0.1 is used. LRjoint includes the intensity parameters ν1.=0.1, ν.1=0.1 and ν11, where ν11 takes different values for different ψ’s.

Table 1: MRL0 (median run length), for PFA=0.10

ρ(X,Y) = 0 ρ(X,Y) = 0.5

ψ(τX, τY) = 0 ψ(τX, τY) = 0.8 ψ(τX, τY) = 0

LRjoint 14 (ν11=0.01) 36 (ν11=0.0895) 14 (ν11=0.01)

LRpar 14 34 14

T2 26 50 26

For a specific situation in Table 1, LRpar and LRjoint have similar MRL0, whereas T2 has a higher median, because of the many early alarms (see Figure 3).

For all methods, MRL0 is almost independent on ρ (dependency between X and Y), but very dependent on ψ (correlation between τX and τY). The reason is that the density of τ(1) is used to weight together the alarm times. As ψ increases towards 1.00, the alarms tend to be more uniformly distributed and MRL0 is longer (see Figure 3).

(15)

t 30 20 10 0 F a

lse alarm densit

y .07 .06 .05 .04 .03 .02 .01 0.00 T2 LRjoint LRpar t 30 20 10 0 F a

lse alarm densit

y .07 .06 .05 .04 .03 .02 .01 0.00 T2 LRjoint LRpar

Figure 3: False alarm density, P(tA=t). Left: ψ=0, right: ψ= 0.8.

t 40 30 20 10 0 Dens ity for tau(1) .3 .2 .2 .1 .1 0.0 ψ = 0.8 0

Figure 4: Probability density for τ(1), when ψ={0,0.8}.

4. Using information about correlation between change times

Situations where X and Y may change at different time points, τX and τY, are considered. The delay of a motivated alarm, in relation to the first change point, is used to compare the methods (CED(t1,t2) in (7)). The delay is studied for ψ={0, 0.8}, i.e. when we assume that the changes occur independently and when they tend to occur more simultaneously.

4.1 Comparison between the three methods

The effect of assumptions regarding the correlation between τX and τY is investigated. The LRjoint is the only method (of these three) that uses the information of ψ in the alarm statistic, by different weights for the three components (see Section 2.3).

(16)

t2 20 15 10 5 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 14 10 1 t2 20 15 10 5 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 14 10 1 t2 20 15 10 5 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 14 10 1

Figure 5: CED(t1, t2), for t1={1, 10, 14}, where the pattern is the same for CED(10, t2) and CED(14, t2). This is called steady curve. Left: LRpar, middle: LRjoint, right: T2.

4.1.1 First change occurs immediately

The CED(t1,t2) is compared between the methods for ψ={0, 0.8}, when the first change occurs immediately, i.e. τ(1)=1.

t2 16 14 12 10 8 6 4 2 0 CED(t1=1, t2) 4.8 4.4 4.0 3.6 3.2 2.8 T2 LRjoint LRpar t2 16 14 12 10 8 6 4 2 0 CED(t1=1, t2) 4.8 4.4 4.0 3.6 3.2 2.8 T2 LRjoint LRpar

Figure 6: CED(t1,t2) when t1=1 and t2={1, 2, ..., 14}. Left: ψ=0, right: ψ=0.8.

For all methods the delay is shortest when both changes occur immediately (τX=τY=1), independent of ψ.

When independent change times are assumed (ψ=0), the methods give similar delay. For LRjoint and LRpar, this is also indicated by the similar alarm regions in Figure 2. The exception is at (τX = τY =1) where LRjoint has slightly shorter delay. LRjoint consists of three components (C-D, D-C, C-C, see section 2.3), whereas LRpar is based on two components (C-D and D-C). For ψ=0, T2 has the shortest delay.

When a positive correlation is assumed between the change times (ψ=0.8), LRjoint is superior at (τX = τY =1). LRjoint assigns a large weight to the C-C component and when the changes actually occur simultaneously, LRjoint has a short delay.

(17)

In Figure 7, the CED(t1, t2) of the steady curve is compared between the methods for ψ={0, 0.8}.

When one change occurs late (here t1=10), the methods are more different than for t1=1. For ψ=0, T2 has the longest CED, except at (t1=10, t2=1). The CED curves for LRpar and LRjoint are rather similar.

Also for ψ=0.8, T2 has longer CED than the two LR methods. For ψ>0, the difference is larger between LRpar and LRjoint: at simultaneous change times (τX = τY), LRjoint has shortest delay, but for τX≠τY, LRpar is a little better. LRjoint assigns a large weight to the C-C-component, which is an advantage when τX=τY, but not when τX≠τY.

t2 16 14 12 10 8 6 4 2 0 CED(t1, t2) 4.8 4.4 4.0 3.6 3.2 2.8 T2 LRjoint LRpar t2 16 14 12 10 8 6 4 2 0 CED(t1, t2) 4.8 4.4 4.0 3.6 3.2 2.8 T2 LRjoint LRpar

Figure 7: CED(t1,t2) when t1=10 and t2= {1, 2, ..., 14}. Left: ψ=0, right: ψ=0.8. 4.2 Effect of change times

As mentioned above, the delay depends on the change times, τX and τY. The effect of assumptions about the correlation between τX and τY is studied for each method.

t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1 t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1

(18)

t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1 t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1

Figure 8b: CED(t1,t2) for LRjoint when t1={1, 5, 10}. Left: ψ=0, right: ψ=0.8.

t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1 t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1

Figure 8c: CED(t1,t2) for T2 when t1={1, 5, 10}. Left: ψ=0, right: ψ=0.8.

From Figure 8 we conclude that for each value of t1, the CED(t1,t2) curve reaches an asymptote after a while, CED(t1,∞). For T2, this asymptote is independent of t1, whereas for LRpar and LRjoint it depends on t1, for small values of t1.

For all methods, CED is longer when ψ>0. For LRpar, the CED-curve, for each value of t1, is higher as a result of the higher alarm limit. The same holds for T2. Also for LRjoint, the CED values are higher, but there is also a larger difference between CED(t,t) and CED(t,∞). As discussed above, for ψ>0 the LRjoint works well when the changes do actually occur simultaneously.

(19)

5. Processes under surveillance are dependent

The methods are compared when the processes X and Y are independent or have a positive covariance, ρ={0, 0.5}. No correlation is assumed between the change times (ψ=0).

5.1 Comparison between the three methods

For T2 and LRjoint, the alarm statistic does incorporate the information of ρ, whereas the alarm statistic of LRpar is the same, independent of ρ.

5.1.1 First change occurs immediately

The CED curves, when τ(1)=1, are compared between the methods, for ρ={0, 0.5}. For immediate changes in both processes (τX =τY =1), the T2 has the shortest CED and LRpar the longest, when X and Y are independent (ρ=0). However, for ρ=0.5, LRpar has shortest delay and LRjoint longest. One reason for the long delay of LRjoint is the assumption that ψ=0. Also, Wessman (1999) investigated multivariate surveillance for the situation C={τ(1)=s} and showed that if changes occur in all processes, the probability of detecting it is lower if the observations are highly correlated. This agrees with the long CED for T2 and LRjoint for τX =τY =1 (see section 5.2.1).

t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 T2 LRjoint LRpar t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 T2 LRjoint LRpar

Figure 9: CED(t1,t2) when t1=1 and t2= {1, 2, ..., 10}. Left: ρ=0, right: ρ=0.5.

5.1.2 Steady curve

The methods are compared for ρ={0, 0.5}, using the steady delay curve.

(20)

For ρ=0, the CED(t,t) is much smaller than CED(t,∞) for all three methods. But for ρ=0.5, when we use LRjoint or T2, the CED(t,∞) is only just smaller than CED(t,t), since here CED(t,t) is comparatively long.

t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 T2 LRjoint LRpar t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 T2 LRjoint LRpar

Figure 10: CED(t1,t2) when t1=10 and t2= {1, 2, ..., 14}. Left: ρ=0, right: ρ=0.5.

5.2 Effect of change times

The effect of the covariance between X and Y is studied, ρ={0, 0.5}, for each of the three methods. For LRjoint and T2, the CED(t,t) is shorter than CED(t,⏐t-1⏐), but the difference is small for ρ=0.5, compared to ρ=0. Both T2 and LRjoint do include the components of the Mahalanobis distance, namely (x-μD), (y-μD) and (-2ρ(x-μD)(y-μD)). Wessman (1999) pointed out that, when ρ>0.5, the Mahalanobis distance is smaller for unit shifts in both processes than for a shift in only one process. For LRjoint and T2, the CED(t,∞) is shorter when ρ>0, indicating that far apart changes are easy to detect for methods that incorporate ρ.

t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1 t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1

(21)

t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1 t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1

Figure 11b: CED(t1,t2) for LRjoint when t1={1, 5, 10}. Left: ρ=0.0, right: ρ=0.0.

t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1 t2 16 14 12 10 8 6 4 2 0 CED(t1,t2) 4.8 4.4 4.0 3.6 3.2 2.8 t1 10 5 1

Figure 11c: CED(t1,t2) for T2 when t1={1, 5, 10}. Left: ρ=0.0, right: ρ=0.0.

A positive covariance between the processes (X and Y) does not affect the CED of LRpar to any large extent.

For LRjoint, the immediate change situation, τX=τY=1, is quicker detected if the processes are independent. For positive ρ-values X and Y tend to be alike, thus independent processes contain more information. Also for T2, the (τX =τY =1) situation is easier to detect if the processes are independent.

For late first changes (τ(1)≥5), the difference in CED(t, ∞) between LRpar and LRjoint is larger when ρ=0.5, compared to ρ=0.

The steady curve of T2 only dependens on the distance (t1-t2). This was proven generally in Andersson (2005).

5.2.1 The delay for close changes

(22)

that the situation that produces an alarm differs between ρ=0 and ρ=0.5, for LRjoint and T2.

For LRjoint, one reason for the long delay for (τX =τY) is the assumption of ψ=0, which results in a low weight for the C-C component. However, the result with long delay for close changes is present also for T2, which does not use information about ψ. For both LRjoint and T2, Figure 2 demonstrates that, when ρ=0, an alarm is called when either X’=(X-μD) or Y’=(Y-μD) or (X’+Y’) is large. The sum of two variables that changed simultaneously (from 0 to a positive value μ) is stochastically larger than the sum of two variables with different change times (and thus different expected values, at least at one time point). This can be illustrated by the probability calculations below. P(X’+Y’ > k⏐μXY=μ) = P Z( >(k(2 )) / 2μ σ2 , for τ

X =τY, P(X’+Y’ > k⏐μX=0, μY=μ)=P Z{ >(k−( )) / 2μ σ2}, for τX ≠ τY, where the probability is larger for τX =τY, i.e. P(Z>k-2μ) > P(Z> k-μ).

When ρ>0, Figure 2 indicate that an alarm is called when the absolute difference, ⏐X’-Y’⏐, is large enough. The difference between two variables that change simultaneously (from 0 to a positive value μ) is stochastically smaller than the difference between variables with different change times, illustrated by

P(X’-Y’ > k⏐μXY=μ) = { ( (2 )) / 2 2(1 )}

P Z > k− μ σ −ρ , for τX =τY,

P(X’-Y’ > k⏐μX=μ, μY=0)= { ( ( )) / 2 2(1 )}

P Z> k− μ σ −ρ , for τX ≠τY,

where the probability is larger for τX ≠τY, i.e. P(Z>k-μ) > P(Z>k-2μ).

However, for “moderate” covariance, ρ=0.5, the long CED for simultaneous changes does not always hold. E.g in Figure 11b-11c, we have CED(1,1) < CED(1,2). The CED also depends on the model for μD and μC. The Mahalanobis distance is proportional to

(

( ) D

) (

2 ( ) D

)

2 2

(

( ) D

)(

( ) D

)

x t −μ + y t −μ − ρ x t −μ y t −μ ,

where E[X(t)] and E[Y(t)] depend on τX and τY. We compare the expected Mahalanobis distance, here denoted EM, for two situations: (τX =τY =1) and (τX =1, τY =2).

EM(1,1)=

(

μC1−μD

)

2⋅ −

(

1 2ρ

)

, when τ

X =τY =1,

EM(1,2)=

(

μC1−μD

) (

2+ μC2−μD

)

2−2ρ μ

(

C1−μD

)(

μC2−μD

)

, when τ

(23)

6. Discussion

Warning systems are used in many areas: public health, bio terrorism, radiation, pregnancy, intensive care patients. Very often, several processes can be used to detect an underlying change, e.g. confirmed and suspected cases of a disease. Then we need a warning (or surveillance) system for multivariate data. When monitoring more than one process, we must consider the dependency between the processes, the correlation between the change times and the possible differences between the processes.

In this paper, the effect of different types of dependencies is investigated for three methods of multivariate surveillance. One method uses the Hotellings T2, where the multivariate data is reduced to a scalar at each time point. The second method is a parallel system, where the likelihood ratio method is applied to each marginal process, here called the LRpar system. The third method is optimal according to the Shiryaev criterion. It is a simultaneous solution, derived from the joint likelihood ratio, here called LRjoint.

The alarm limits of each of the three systems are adjusted to yield the same false alarm probability for some different situations. One situation is where we assume that the changes occur independent (i.e. change times τX and τY are uncorrelated) and also that the processes (X and Y) are independent, conditional on τX and τY. In another situation, we assume a positive correlation between τX and τY (ψ>0, changes tend to occur more simultaneously). In a third situation there is a positive covariance between the monitored processes X and Y, conditional on τX and τY (ρ>0). The evaluation is made using the delay of an alarm, in relation to the first change time.

The methods are first compared for independent processes and no correlation between the change times (ρ=0, ψ=0). For immediate changes in both processes (τX=τY=1), T2 has the shortest delay, followed by LRjoint and then LRpar. T2 allocates the alarms early, whereas the two LR methods have very few early alarms. LRjoint has a slightly shorter delay than LRpar, since LRjoint is based on three components, corresponding to a change in either X or Y or both, whereas LRpar is based on only two components: change in either X or Y and thus not optimal for simultaneous changes. However, the difference is very small since ψ=0 (we do not expect simultaneous changes to be too frequent). Also for later simultaneous changes, LRjoint is slightly better than LRpar, whereas T2 here yields a long delay (an effect of allocating the alarms early). Thus, even if T2 is a reduction, which is sufficient at simultaneous change times, the method is not nessecarily optimal here. In univariate surviellance, the Shewhart method has a constant CED(t) over all values of t, and the corresponding holds for T2 (constant CED(t,t) over all t). However, for τ>1, Shewhart does not have the shortest delay, and neither has T2. Both Shewhart (univariate) and T2 (multivariate) allocate many alarms early and few at later time points, thus producing longer delay for later changes. For (ρ=0, ψ=0), when we consider different change times, LRpar has the same delay as LRjoint and T2 has the longest delay, because of the alarm allocation.

(24)

delay. T2, as for ψ=0, only works well for early simultaneous changes and for late simultaneous changes, LRpar has shorter delay than T2. For (ρ=0, ψ=0.8), when we have different change times, LRpar gives slightly shorter delay than LRjoint, and T2 gives the longest delay. LRjoint uses the information that ψ>0, which is a disadvantage when the changes actually do occur at different times.

Next, the three methods are compared for the situation when X and Y have a positive covariance, but the change times are uncorrelated (ρ=0.5, ψ=0). For immediate changes in both processes, τX=τY=1, LRpar has the shortest delay, followed by T2 and then LRjoint. For late simultaneous changes, LRpar gives shorter delay than LRjoint (although LRjoint now works better than T2). One explanation to the long delay for LRjoint is the assumption that τX and τY are independent (ψ=0). If we used the assumption ψ>0, LRjoint would perform better at the simultaneous changes. However, the long delay for close changes is present also for T2, which does not use information about ψ. The alarm statistics of LRjoint and T2 both include ρ. When ρ>0, an alarm is called when the absolute difference between (X-μD) and (Y-μD) is large, both for LRjoint and for T2. The probability that this difference is large, is lower when τX =τY. Both the LRjoint statistic and the T2 statistic includes the Mahalanobis function, thus when the components (x-μD) and (y-μD) both are large and ρ is large, then (-2ρ(x-μD)(y-μD)) is also large and therefore the alarm statistic produces a small value if ρ is large. Heuristically, we may say that when X and Y have a positive covariance, they give similar information, especially if they have the same distribution (τX=τY) and this may explain the long delay. For (ρ=0.5, ψ=0), when we have different change times, LRjoint gives a shorter delay than LRpar and T2 yields the longest delay.

Generally, the difference in delay is large between LRjoint and T2, in so that LRjoint gives shorter delay than T2, except at early, simultaneous changes. The difference between LRjoint and LRpar is smaller than between LRjoint and T2.

If all processes change immediately, T2 gives shortest delay, in case of total independence (ρ=0, ψ=0), whereas LRjoint is best if there is a positive correlation between the change times (ρ=0, ψ>0). If there is a positive covariance between the processes and no correlation between the change times (ρ>0, ψ=0), then LRpar gives the shortest delay for τX=τY=1. For later simultaneous changes, LRjoint is best, for all non-negative correlations, when the processes are independent (ψ≥0, ρ=0). For processes with a positive covariance, LRpar yields the shortest delay for later simultaneous changes, at least when no correlation is assumed between the change times (ρ>0, ψ=0).

If the processes change at different time points, LRjoint gives the shortest delay, except for a positive correlation between the change times (ψ>0). This holds for both independent processes and when the covariance is positive (ρ≥0).

In this paper we only deal with positive dependency. A negative correlation between the change times, ψ<0, implies that the change times do not coincide. Then it would be best to have a very small weight for the component for simultaneous changes. If the processes themselves have a negative covariance (ρ<0), the implication is that X= -Y. Then the alarm region would constitute of large values of (X+Y) and the alarm region for

(25)

Appendix

The Mij(t) is based on two distances, namely di(t)=(μCi(t)-μD)/σ and dj(t)=(μCj(t)-μD)/σ.

, we compare M1,1(t) and M1,2(t) (below the time index t is dropped): M1,1= 2 2 1 1 1 1 2 2 (1 ) d d ρd d ρ + − − and M1,2= 2 2 1 2 1 2 2 2 (1 ) d d ρd d ρ + − − .

If the inequality M1,2 > M1,1 holds, this does indicate that CED(1,1)<CED(1,2). The

inequality between the M-values holds when 2 2

1 2 1 1 2

(26)

References

Abu-Shawiesh, M. and Abdullah, M. (2001) A new robust bivariate control chart for location. Communications in Statistics-Simulation and Computation, 30, 513-530. Alt, F. B. (1985) Multivariate quality control, Wiley.

Andersson, E. (2004) The impact of intensity in surveillance of cyclical processes.

Communications in Statistics-Simulation and Computation, 33, 889-913.

Andersson, E. (2005) The T2 method in multivariate on-line surveillance. The delay of an alarm. Working paper, Statistical Research Unit, Göteborg University, Sweden. Andersson, E., Bock, D. and Frisén, M. (2004) Detection of turning points in business

cycles. Journal of Business Cycle Measurement and Analysis, 1, 93-108.

Aparisi, F., Jabaloyes, J. and Carrión, A. (2001) Generalized variance chart design with adaptive sample sizes. The bivariate case. Communications in

Statistics-Simulation and Computation, 30, 931-949.

Baron, M. (2002) Bayes and asymptotically pointwise optimal stopping rules for the detection of influenza epidemics. In Case Studies in Bayesian Statistics, Vol. 6 (Eds, Gatsonis, C., Kass, R. E., Carriquiry, A., Gelman, A., Higdon, D., Pauler, D. K. and Verdinelli, I.) Springer-Verlag, New York, pp. 153-163.

Bock, D. (2003) Similarities and differences between statistical surveillance and certain decision rules in finance. Research report 2003:4, Statistical Research Unit, Göteborg University, Sweden.

Bodden, K. M. and Rigdon, S. E. (1999) A program for approximating the in-control ARL for the MEWMA chart. Journal of Quality Technology, 31, 120-123. Charnes, J. M. (1995) Tests for special causes with multivariate autocorrelated data.

Computers & Operations Research, 22, 443-453.

Cheng, A. Y. and Liu, R. Y. (2000) Monitoring multivariate aviation safety data by data depth: Control charts and threshold systems. IIE Transactions, 32, 861-872. Crosier, R. B. (1988) Multivariat Generalizations of Cumulative Sum Quality-Control

Schemes. Technometrics, 30, 291-303.

Does, R. J. M. M., Roes, K. C. B. and Trip, A. (1999) Handling multivariate problems with univariate control charts. Journal of Chemometrics, 13, 353-369.

Frisén, M. (2003) Statistical surveillance. Optimality and methods. International

Statistical Review, 71, 403-434.

Frisén, M. and de Maré, J. (1991) Optimal Surveillance. Biometrika, 78, 271-80. Frisén, M. and Sonesson, C. (2002) Optimal surveillance based on exponentially

weighted moving average. Research report 2002:1, Department of Statistics, Göteborg University, Sweden.

Frisén, M. and Wessman, P. (1999) Evaluations of likelihood ratio methods for surveillance. Differences and robustness. Communications in Statistics.

Simulations and Computations, 28, 597-622.

Gan, F. (1993) An optimal-design of EWMA control charts based on median run-length.

Journal of Statistical Computation and Simulation, 45, 169-184.

(27)

Guerrero-Cusumano, J.-L. (1995) Testing variability in multivariate quality control: A conditional entropy measure approach. Information Sciences, 86, 179-202. Hamilton, J. D. (1989) A new approach to the economic analysis of nonstationary time

series and the business cycle. Econometrica, 57, 357-384.

Hawkins, D. M. (1991) Multivariate Quality Control Based on Regression-Adjusted Variables. Technometrics, 33, 61-75.

Hotelling, H. (1947) Multivariate Quality Control. In Techniques of statistical

analysis(Eds, Eisenhart , C., Hastay, M. W. and Wallis, W. A.) McGraw-Hill,

New York.

Javaheri, A. and Houshmand, A. A. (2001) Average run length comparison of

multivariate control charts. Journal of Statistical Computation and Simulation, 69, 125-140.

Jolayemi, J. K. (2000) A model for the statistical design of multivariate control charts with multiple control regions. Applied Mathematics and Computation, 109, 73-91. Järpe, E. (1999) Surveillance of spatial patterns. Communications in Statistics. Theory

and Methods, 28, 3009-3025.

Kalagonda, A. A. and Kulkarni, S. R. (2003) Diagnosis of Multivariate Control Chart Signal Based on Dummy Variable Regression Technique. Communications in

Statistics-Theory and Methods, 32, 1665-1684.

Kang, L. and Albin, S. L. (2000) On-line monitoring when the process yields a linear profile. Journal of Quality Technology, 32, 418-426.

Koskinen, L. and Öller, L.-E. (2004) A Classifying Procedure for Signalling Turning Points. Journal of Forecasting, 23, 197-214.

Kourti, T. and MacGregor, J. F. (1996) Multivariate SPC methods for process and product monitoring. Journal of Quality Technology, 28, 409-428.

Love, T. E. and Linderman, K. (2003) A Weibull process failure mechanism for the economic design of MEWMA control charts. Journal of Statistical Computation

and Simulation, 73, 195-202.

Lowry, C. A., Woodall, W. H., Champ, C. W. and Rigdon, S. E. (1992) A multivariate exponentially weighted moving average control chart. Technometrics, 34, 46-53. Lu, X. S., Xie, M., Goh, T. N. and Lai, C. D. (1998) Control chart for multivariate

attribute processes. International Journal of Production Research, 36, 3477-3489. Marshall, A. W. and Olkin, I. (1997) A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika,

84, 641-652.

Mason, R. L., Chou, Y. M., Sullivan, J. H., Stoumbos, Z. G. and Young, J. C. (2003) Systematic patterns in T-2 charts. Journal of Quality Technology, 35, 47-58. Mason, R. L., Tracy, N. D. and Young, J. C. (1995) Decomposition of T2 for multivariate

control chart interpretation. Journal of Quality Technology, 27, 99-108. Molnau, W. E., Runger, G. C., Montgomery, D. C., Skinner, K. R., Loredo, E. N. and

Prabhu, S. S. (2001) A program for ARL calculation for multivariate EWMA charts. Journal of Quality Technology, 33, 515-521.

Neftci, S. (1982) Optimal prediction of cyclical downturns. Journal of Economic

(28)

Petzold, M., Sonesson, C., Bergman, E. and Kieler, H. (2004) Surveillance in

longitudinal models. Detection of intra-uterine growth restriction. Biometrics, 60, 1025-1033.

Pignatiello, J. J. and Runger, G. C. (1990) Comparisons of Multivariate CUSUM Charts.

Journal of Quality Technology, 22, 173-186.

Qiu, P. H. and Hawkins, D. (2001) A rank-based multivariate CUSUM procedure.

Technometrics, 43, 120-132.

Roberts, S. W. (1959) Control Chart Tests Based on Geometric Moving Averages.

Technometrics, 1, 239-250.

Royston, P. (1991) Identifying the fertile phase of the human menstrual cycle. Statistics

in Medicine, 10, 221-240.

Runger, G. C., Keats, J. B., Montgomery, D. C. and Scranton, R. D. (1999) Improving the performance of the multivariate exponentially weighted moving average control chart. Quality and Reliability Engineering Internato, 15, 161-166. Sepúlveda, A. and Nachlas, J. A. (1997) A simulation approach to multivariate quality

control. Computers & Industrial Engineering, 33, 113-116.

Serel, D. A., Moskowitz, H. and Tang, J. (2000) Univariate (X)over-bar control charts for individual characteristics in a multinormal model. Iie Transactions, 32, 1115-1125.

Shewhart, W. A. (1931) Economic Control of Quality of Manufactured Product, MacMillan and Co., London.

Shiryaev, A. N. (1963) On optimum methods in quickest detection problems. Theory of

Probability and its Applications., 8, 22-46.

Sonesson, C. and Frisén, M. (2005) Multivariate surveillance. In Spatial surveillance for

public health(Eds, Lawson, A. and Kleinman, K.) Wiley.

Stoumbos, Z. G. and Jones, L. A. (2000) On the properties and design of individuals control charts based on simplicial depth. Nonlinear Studies, 7, 147-178. Stoumbos, Z. G. and Sullivan, J. H. (2002) Robustness to non-normality of the

multivariate EWMA control chart. Journal of Quality Technology, 34, 260-276. Sun, K. and Basu, A. P. (1995) A characterization of a bivariate geometric distribution.

Statistics & Probability Letters, 23, 307-311.

Talluri, S. and Sarkis, J. (2002) A methodology for monitoring system performance.

International Journal of Production Research, 40, 1567-1582.

Wessman, P. (1998) Some Principles for surveillance adopted for multivariate processes with a common change point. Communications in Statistics. Theory and Methods,

27, 1143-1161.

Wessman, P. (1999) The surveillance of several processes with different change points.

Research report, Department of Statistics, Göteborg University, Sweden.

Wikström, C., Albano, C., Eriksson, L., Fridén, H., Johansson, E., Nordahl, Å., Rännar, S., Sandberg, M., Kettaneh-Wold, N. and Wold, S. (1998) Multivariate process and quality monitoring applied to an electrolysis process: Part I. Process supervision with multivariate control charts. Chemometrics and Intelligent

Laboratory Systems, 42, 221-231.

(29)

Yumin, L. (1996) An improvement for mewma in multivariate process control*1.

(30)

Research Report

2003:1 Holgersson, T. & Testing for multivariate heteroscedasticity.

Shukur, G.:

2003:2 Holgersson, T.: Testing for multivariate autocorrelation. 2003:3 Petzold, M. & Detection of intrauterine growth restriction.

Sonesson, C.:

2003:4 Bock, D.: Similarities and differences between statistical surveillance and certain decision rules in

finance.

2003:5 Holgersson, T. & A comparison of conditioned versus

Lindström, F.: unconditioned forecasts of the VAR(1) process. 2003:6 Bock, D.: Early warnings for turns in business cycles

and finance.

2003:7 Lindström, F.: On prediction accuracy of the first order vector auto regressive process.

2003:8 Petzold, M. & Maximum Likelihood Ratio based small-sample Jonsson, R.: tests for random coefficients in linear

regression.

2003:9 Petzold, M.: Preliminary testing in a class of simple non-linear mixed models to improve estimation accuracy.

2003:10 Frisén, M. & Graphical evaluation of statistical surveillance.

Gottlow, M.:

2003:11 Frisén, M.: Statistical measures for evaluation of methods for syndromic surveillance.

2003:12 Andersson, E.: A monitoring system for detecting starts and declines of influenza epidemics.

2004:1 Bock, D., van Dijk, D. Monitoring macroeconomic volatility. & Franses, P.H.:

References

Related documents

fund performance Surveillance 5 portfolio performance stopping 3 fund performance change point 1 portfolio performance surveillance 3 fund performance stopping 1

In Section 3, some commonly used optimality criteria are described, and general methods to aggregate information sequentially in order to optimize surveillance are discussed.. One

For the conditional model with an observation before the possible change there are sharp results of optimality in the literature.. The unconditional model with possible change at

In Sweden, two types of data are collected during the influenza season: laboratory diagnosed cases (LDI), collected by a number of laboratories, and cases of influenza-like

Theorem 2: For the multivariate outbreak regression in Section 2.2 with processes which all belong to the one-parameter exponential family and which are independent and identically

Predictions by early indicators of the time and height of yearly influenza outbreaks in Sweden.. Eva Andersson 1

Here a simple method based on quantiles (Q method) is compared with the Maximum Likelihood (ML) method when estimating the parameters in censored two-parameter Weibull

Severin and Schmid (1998), Severin and Schmid (1999), Schipper and Schmid (2001a), and Schipper and Schmid (2001b) compared the performance of different versions of the CUSUM,