
Institutionen för Medicinsk Teknik
Department of Biomedical Engineering

Examensarbete (Master's Thesis)

Computer Support Simplifying Uncertainty Estimation using Patient Samples

Master's thesis in Medical Informatics carried out at Linköping Institute of Technology

by

Stein Norheim

LiTH-IMT/MI20-EX--08/460--SE

Linköping 2008

Department of Biomedical Engineering, Linköpings tekniska högskola, Linköpings universitet, SE-581 85 Linköping, Sweden


Computer Support Simplifying Uncertainty Estimation using Patient Samples

Master's thesis in Medical Informatics carried out at Linköping Institute of Technology

by

Stein Norheim

LiTH-IMT/MI20-EX--08/460--SE

Supervisors: Prof. Elvar Theodorsson, IKE, Linköpings universitet
             Dr. Ylva Nilsson, ISY, Linköpings universitet
Examiner: Prof. Hans Åhlfeldt, IMT, Linköpings universitet


Avdelning, Institution / Division, Department: Division of Medical Informatics, Department of Biomedical Engineering, Linköpings universitet, SE-581 85 Linköping, Sweden

Datum / Date: 2008-01-25

Språk / Language: Engelska / English

Rapporttyp / Report category: Examensarbete

URL för elektronisk version / URL for electronic version: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-68278

ISRN: LiTH-IMT/MI20-EX--08/460--SE

Titel / Title: Datorstöd för att underlätta skattning av analytisk mätosäkerhet med patientprover / Computer Support Simplifying Uncertainty Estimation using Patient Samples

Författare / Author: Stein Norheim

Nyckelord / Keywords: analytical uncertainty, split-sample, quality assurance, method comparison, bias estimation, patient samples, mentor principle


Abstract

In this work, a practical approach to assessing bias and uncertainty using patient samples in a clinical laboratory is presented. The scheme is essentially a split-sample setup where one instrument is appointed as the "master" instrument, to which the other instruments are compared. The software presented automatically collects test results from a Laboratory Information System in production and couples together the results of pairwise measurements. Partitioning of measurement results by user-defined criteria, and how this can facilitate isolation of variation sources, are also discussed. The logic and essential data model are described and the surrounding workflows outlined. The described software and workflow are currently in considerable practical use in several Swedish large-scale distributed laboratory organizations. With the appropriate IT support, split-sample testing can be a powerful complement to external quality assurance.


Acknowledgments

This thesis is submitted for the degree of Master of Science in Applied Physics and Electrical Engineering at Linköping University. I wish to express my gratitude to everyone who has, in various ways, contributed to this work. In particular, I want to mention:

Prof Elvar Theodorsson, who supervised the work from design to implementation and provided valuable comments on this report. He is a great business partner and father.

Dr Ylva Nilsson. Ylva's supervision and feedback during the writing process were indispensable for this thesis to ever be written.

Prof Hans Åhlfeldt, who has kindly agreed to be the examiner for this work, and also provided useful comments on this report.

Margareta Karlsson, Elisabeth Gustavsson, Peter Helleberg, Gunilla Weineholm and their colleagues, whose hard work and valuable feedback during and after installation made the implementation successful.

TietoEnator, which adopted the product and made it available to a wide range of users. In particular, I want to mention John Rönnbäck, who took the initiative for the acquisition and became my boss at TietoEnator, and Fredrik Öberg, who currently maintains the software.

Andreas Fältskog, my opponent.

Ebba and the rest of my family for their love and support.


Contents

1 Introduction
   1.1 Are You Feeling Better Today?
   1.2 Trapped in the Matrix
   1.3 And a Master Thesis?

2 Basic Theory
   2.1 Analytical Error and Uncertainty
      2.1.1 Systematic Errors (Bias)
      2.1.2 Random Errors
      2.1.3 Blunders
      2.1.4 Uncertainty
      2.1.5 What Causes Uncertainty?
      2.1.6 Uncertainty Estimation According to GUM
      2.1.7 Uncertainty Estimation Using Patient Samples
   2.2 Current Quality Assurance Programs
      2.2.1 Conventional Control Samples
      2.2.2 External Quality Assurance
      2.2.3 Internal Quality Assurance

3 The Mentor Principle
   3.1 Selecting Samples
      3.1.1 Impact of Time
   3.2 Data Partitioning
      3.2.1 Organizational Partitioning
      3.2.2 Geographical Partitioning
      3.2.3 Partitioning Based on Analytical Method
      3.2.4 Presentation Based Partitioning
      3.2.5 Partitioning Groups Throughout the Application
   3.3 Identifying Mentor Samples
      3.3.1 How are the Bar Codes Assigned?
      3.3.2 Practical Approach
   3.4 Retrieving the Results

4 Data Examination
   4.1 Statistical processing
      4.1.1 Normalization
      4.1.2 Covariance Analysis
      4.1.3 Estimating the Master Instrument Random Error
   4.2 Graphical Presentation
      4.2.1 Scatterplot
      4.2.2 Bias Plot

5 Implementation
   5.1 Architectural Decisions
      5.1.1 Borland Delphi
      5.1.2 InterBase/Firebird
      5.1.3 Two-Tier Solution
   5.2 Data Acquisition
      5.2.1 Automatic Data Acquisition
      5.2.2 Manual Data Entry
      5.2.3 Combining Master and Adept Results
   5.3 Data Structures
      5.3.1 Storage of Control Results
      5.3.2 Default Expected Concentration and Limits
      5.3.3 Sequence Activity Period
      5.3.4 Data Partitioning Groups
      5.3.5 Caching the Group Tree
      5.3.6 Authorization and Authentication
      5.3.7 Definition of Mentor Participation
   5.4 Statistical Summaries
      5.4.1 Selecting Data to be Processed
      5.4.2 Summaries Based on Raw Data
      5.4.3 Result Aggregation
      5.4.4 User Interface
   5.5 Panorama View

6 Results and Discussion
   6.1 Real-World Usage
   6.2 Benefits
      6.2.1 Quality Improvement
      6.2.2 Information Availability
      6.2.3 Economy
      6.2.4 Innovative Usage with EQA samples
   6.3 Sources of Errors
      6.3.1 Typing Errors
      6.3.2 Wrong Laboratory IDs
      6.3.3 Samples Never Reaching the Master Instrument
   6.4 Comparison with Other QA Measures
      6.5.1 Two-Tier Solution
      6.5.2 Firebird as RDBMS
      6.5.3 Borland Delphi as Development Platform
   6.6 Current Status of the Product
   6.7 Suggestions for Future Development
      6.7.1 Simplified Mentor Sample Handling
      6.7.2 Avoiding Data Replication
      6.7.3 Automatic Calculation of Limits
      6.7.4 Web Application

7 Conclusions

Bibliography

A Database Diagrams
   A.1 Entities
   A.2 Relationships
   A.3 Association attributes


Chapter 1

Introduction

1.1 Are You Feeling Better Today?

Imagine a meeting with a doctor at your local health center. A couple of blood samples are collected from you, which are analyzed immediately at point-of-care instruments, and the results arrive only fifteen minutes later. Neat, isn't it? Based on the results, which unfortunately did not look that good, the doctor arranges for you to meet a specialist at the main hospital. One week later, at the hospital, new samples are collected and processed at the main, large-scale laboratory. Now, the question is: can you tell from comparing these results whether your status has changed or not? Does the change require actions to be taken in the treatment? How much must two test results disagree in order to be significantly different? The answer does of course depend not only on the normal interpersonal variation of the analytes in question, but also on the possibility of a bias between the two different analysis methods and equipment, and furthermore on the effect of various random errors. As long as this is not known, measured values from different methods should only be compared and acted on after gathering sufficient evidence.

This thesis work aims at providing means for the health care organization to estimate the bias and random errors inherent in the measurement results throughout the entire organization.

1.2 Trapped in the Matrix

During measurement, you will, whether you want to or not, need to make a number of assumptions about the environment and the sample you are performing the measurement on. As a simple example, say that you need to measure the amount of milk in a bottle. If you know the exact weight of the bottle and the density of milk, you could probably do this by weighing the bottle with milk and everything. Given that the bottle weight and milk density are constant, there would be a linear relationship between the total weight and the volume of the milk, and the scales could, at least in theory, be calibrated to directly show the milk volume. Now, how is that done? One way would be to examine every step of the measurement process and calibrate the various steps individually. You could calibrate the scales by weighing bottles containing known amounts of water, and then translate the values into the corresponding amounts of milk, given the assumed density of milk, which has previously been determined with high accuracy and precision. Any error during each step will certainly influence the final result. If precision needs to be high, you will also need to take into consideration that the density of milk depends on its percentage of fat, which in turn can depend on the season and the cows producing it.

Another way of measuring the volume would be the simpler approach of using a measuring glass. The fluid is poured into the glass, and the amount of milk is determined by checking the level of the surface against the scale printed on the glass. This method is not at all sensitive to variations in the density, but does on the other hand have the drawback of requiring the fluid to leave its original container when its volume is being determined. This requirement could make the method unusable in everyday use, due to the additional handling or contamination risks.

Ideally, the two methods should yield the same result when measuring the same milk, but one of them is sensitive to changes in the fluid itself. For instance, adding a preservative to the milk will probably alter its density, making the methods disagree on the volume. Altering the sample in this way is called altering the matrix of the sample, and is a common source of measurement errors. Whether or not this difference is significant depends both on the overall precision of the methods and the needs of the user who receives the values.

1.3 And a Master Thesis?

This work aims at providing a way of comparing various methods by using the kind of samples that the methods were designed to provide correct results for: patient samples. Each sample is analyzed twice, once at the instrument whose proficiency we want to test, and once at a designated master instrument. The results from these duplicate tests are collected, and the systematic and random errors (the uncertainty) are estimated from them. A piece of software, and the logistics to facilitate this workflow, have been developed and are in everyday use at several large-scale clinical laboratories.

Chapter 2

Basic Theory

This chapter describes the concepts of analytical error and uncertainty, what the main sources of analytical errors are, and the most common measures taken in order to monitor and improve the analytical quality.

2.1 Analytical Error and Uncertainty

Even though most people have intuitive impressions of the concepts of uncertainty and error, these do not always agree with the formal definitions used in the science of metrology. Therefore, a brief introduction is given here. The classification of errors is of special importance.

Two common concepts are error and uncertainty. There are many types of errors that can occur, but this work focuses on the analytical error, thus ignoring error sources such as faulty sampling or inappropriate handling of the specimen prior to its arrival at the laboratory. The analytical error [QUA00] of a measurement is defined as the difference between the reported value and the true value. In general, we do not know the exact true value, but by comparing with a measuring device known to have high precision, we can get a rather good estimate of the error.

Analytical errors are generally divided into two main categories, namely systematic and random errors. In addition to this, some errors are categorized as spurious or blunders.

2.1.1 Systematic Errors (Bias)

Systematic errors will affect any repeated measurements in the same manner, given that the environmental conditions and the overall state of the measuring device are constant. When the systematic errors are known and predictable, their influence can be compensated for, and thereby eliminated from a measurement result. It is, however, the impression of the author that this is not always done in practice, even though it would improve the quality of the measurements. It is worth noting that the systematic error may depend on the level of the measurement value [QUA00]. Unknown systematic errors, on the other hand, can not be compensated for. A major reason for the present project was to establish practical computerized methods for demonstrating true systematic errors/bias in order to establish their causes and eliminate them.

2.1.2 Random Errors

The random error is defined [VIM93] as:

[The] difference of quantity value obtained by measurement and average that would ensue from an infinite number of replicated measurements of the same measurand carried out under repeatability conditions

As a direct consequence of this, the random errors are by definition bias free and therefore have an expected value of 0. The influence of the random error can be reduced by performing several consecutive measurements of the same sample and averaging them. As [Gra05] points out, a large number of measurements is only meaningful when the sensitivity of the instrument is sufficiently high. If the instrument reports only a few significant digits, the errors introduced by rounding might be of the same magnitude as the random measurement error. Furthermore, for economic reasons it will generally not be feasible to, say, triple the number of analyses in everyday production.

The random error is often expressed in terms of standard deviations, and when estimating it, we shall follow that notation.

2.1.3 Blunders

Blunders, or spurious errors, are in a way random, but are usually treated as a separate group of errors [QUA00]. These errors can be caused by human failure, such as accidentally mixing up digits, or by malfunctioning equipment yielding unpredictable results. These errors are not modeled in [QUA00], but can nevertheless cause patient harm if they go undetected.

2.1.4 Uncertainty

Measurement uncertainty is defined [VIM93] as

A parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand.

The parameter can for instance be a standard deviation or the width of a confidence interval [QUA00], and has the form of a range. Its value can generally not be used to correct the result of a measurement.

Even though knowledge of the uncertainty is crucial information when judging a measurement value, the values reported back to the physicians are in practice hardly ever accompanied by any uncertainty levels. Furthermore, physicians are generally not used to dealing with complex uncertainty calculations.


2.1.5 What Causes Uncertainty?

There are numerous factors influencing the reliability of measured values. These include, but are not limited to [QUA00]:

1. Sampling: a common source of errors that is tricky to monitor and quantify. The patient sample might for instance have been collected in the wrong type of container, or an insufficient amount of the sample might have been taken.

2. Sample preparation: can the sample have been contaminated during preparation? Is it diluted correctly?

3. Presentation of certified reference materials to the measuring system: the reference material might differ somewhat in concentration from what is stated on the label. This will affect any results depending on it.

4. Calibration of instrument: any uncertainty in the calibration will influence the reported result.

5. Analysis (data acquisition): imprecision inherent in the actual measurement method, affected also by the purity of the reagent and the risk of carry-over between samples.

6. Data processing: e.g. rounding and truncation errors, or uncertainty introduced by statistics. Manual calculations might fail due to the human factor.

7. Presentation of results: for instance uncertainty in the actual uncertainty estimate, or various formatting errors.

8. Interpretation of results: if the result is matched against a set of limits and interpreted from there, can we be sure that the limits are relevant for the patient in question? Can there be other factors present that invalidate the limits?

2.1.6 Uncertainty Estimation According to GUM

A substantial scientific literature deals with the issue of estimating the uncertainty of measurement methods. Among these documents, the ISO Guide to the Expression of Uncertainty in Measurement [GUM95], usually referred to as GUM, plays a principal role. The approach for estimating the total uncertainty, advocated by GUM and in wide use today, involves identifying uncertainty factors for the analytical method in question and estimating the contribution of each of them. In the case of the milk measuring device mentioned in the introduction, this would be the same as creating a list of all effects that can possibly contribute to the uncertainty. Such a list would for instance include the effects of inaccurate scales, variation in container weight, rounding errors, and so on. The uncertainties obtained are theoretical numbers which will be reliable, provided that the included factors are correctly estimated and that no vital factors have been omitted. Therefore, detailed knowledge of the analytical process is needed in order to correctly estimate the uncertainty with this method. Methods requiring detailed knowledge of the entire analytical process are called bottom-up methods.

Figure 2.1. Shewhart plot illustrating a time series of control sample measurements, where different rule violations have been marked. Under normal circumstances, the control results should be far more stable than in this example.

2.1.7 Uncertainty Estimation Using Patient Samples

This thesis applies a completely different approach to estimating the uncertainty. Instead of predicting the characteristics of an analytical method a priori, measured values of actual patient samples are compared with observations of the same tests performed by a master instrument, whose analytical characteristics are well documented and assumed to be stable. Out of this data, limits for the uncertainty can be derived. How this is done in detail will be described later on. This method belongs to the category of top-down methods, which use measurement results to estimate uncertainty.

2.2 Current Quality Assurance Programs

The following sections give an overview of how laboratories currently monitor the quality of their production.

2.2.1 Conventional Control Samples

Conventional control samples are widely used in clinical laboratories for determining whether or not the measured values for a batch of analyzed samples should be reported. A set of control rules is defined regarding the measured values of the control samples. The rules usually check the control result against predefined limits, and are often named on the form $n_{ks}$, where $n$ and $k$ are integers (e.g. $2_{2s}$). Such a rule is violated when $n$ consecutive control samples deviate by more than $k$ standard deviations from the expected mean. Violations are easily detected by the human eye if every measurement is plotted in a Shewhart plot (figure 2.1). The figure also illustrates violations of the $R_{4s}$ rule, which occurs when two results in a group differ by more than 4 standard deviations, and the $10_{\bar{x}}$ rule, which detects 10 consecutive measurements all on the same side of the expected mean.

If a control rule is violated, typically the instrument needs to be calibrated and the samples re-analyzed before the measured values can be reported. Figure 2.2 shows a flow chart of rules for use when validating control sample results, as suggested by Westgard et al. [WBHG81].

Figure 2.2. Decision flow chart for deciding whether or not an analytical run should be rejected.
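To make the rule notation concrete, the sketch below shows one plausible way of evaluating the $2_{2s}$, $R_{4s}$ and $10_{\bar{x}}$ rules on a series of control results expressed as z-scores. It is written in Python purely for illustration (the software described in chapter 5 is a Delphi application), the function names and numbers are made up, and the $R_{4s}$ check is simplified to consecutive results:

    def z_scores(results, expected_mean, sd):
        """Express control results as deviations from the expected mean, in SD units."""
        return [(r - expected_mean) / sd for r in results]

    def violates_2_2s(z):
        """Two consecutive controls more than 2 SD from the mean, on the same side."""
        return any(abs(a) > 2 and abs(b) > 2 and a * b > 0 for a, b in zip(z, z[1:]))

    def violates_r_4s(z):
        """Two consecutive results differing by more than 4 SD."""
        return any(abs(a - b) > 4 for a, b in zip(z, z[1:]))

    def violates_10_xbar(z):
        """Ten consecutive results all on the same side of the expected mean."""
        return any(all(v > 0 for v in z[i:i + 10]) or all(v < 0 for v in z[i:i + 10])
                   for i in range(len(z) - 9))

    # A control with expected mean 140 mmol/L and SD 1.5 mmol/L (made-up numbers)
    z = z_scores([141.2, 143.3, 143.4, 139.0, 138.1], expected_mean=140.0, sd=1.5)
    print(violates_2_2s(z), violates_r_4s(z), violates_10_xbar(z))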

True patient samples, that is, samples collected from patients, have different chemical properties compared to control samples. For instance, the durability and stability of a control material must be long enough for the material to reach the laboratory and to be used there, typically for on the order of a year. In contrast, ordinary blood samples need to be analyzed as quickly as possible. Analyzing the sample too late, and other factors related to the handling of the sample, constitute a common source of errors.

The effect caused by samples having different chemical properties is called the matrix effect, and it limits the possibility of reliably determining the systematic error using control samples alone. Simply put, matrix effects are the contributions to the final result from all components in the sample other than the one intended to be measured. For instance, [TEP+04] describes cases where there were significant systematic differences between laboratories, even though validation with the manufacturer's control material indicated no sign of bias.

Even though bias can not be reliably estimated using control samples, calculations on control sample results are routinely used to estimate the random error.

2.2.2 External Quality Assurance

There are several vendors providing services for External Quality Assurance, EQA. Laboratories participating in EQA schemes receive specimens at regular intervals, which they are supposed to analyze and report back the measured values for. In general, the participants are not aware of the true expected value.

The EQA provider collects the results and reports back to each laboratory how they performed in comparison with other participants performing the same tests. The ability to provide natural samples, without any addition of stabilizing substances, is a major advantage of EQA, but requires short latency between production and analysis of the material.

EQA is an effective way of estimating systematic errors, even though participation can be rather costly if many instruments are involved. Furthermore, there is currently very limited, if any, support for handling EQA samples and results in the major Laboratory Information Systems (LIS) in use at Swedish laboratories. This leads to a considerable amount of manual work.

Equalis AB, a leading EQA provider in Sweden, provides material for external testing of chemistry analyses 10 times a year [EqP07], while participants in the corresponding schemes provided by LabQuality OY report results every month. The specimens in the latter case are, however, distributed for several months at a time, four times a year [LqP07].

2.2.3 Internal Quality Assurance

This thesis describes a methodology for estimating the uncertainty of laboratory results by performing an additional analysis of selected patient samples on an appointed master instrument. The master instrument is an analytical instrument within the laboratory that is carefully monitored and assumed to be stable. By pairwise comparing the organization's analytical instruments with the master instrument, used as a common reference, an indirect comparison between all comparable instruments in the organization can be carried out, and the uncertainty for all participating instruments, as well as for the organization as a whole, can be estimated. A crucial advantage of this method is that the confounding effects of the changed matrix (1.2) in stable control samples are completely avoided.

Chapter 3

The Mentor Principle

We call the principle of appointing a master instrument, to which other instruments' results are compared, the mentor principle. The NCCLS guide [fCLS93] names this instrument the comparative method. The instruments being compared to the master (mentor) instrument are called adepts. This naming is not widely established, but was suggested in [TN01] in the absence of a widely accepted nomenclature. In order for the mentor principle to work, the master instrument must itself be reliable and stable over time. This can be assured by giving it extra attention and assuring its production quality through the use of both conventional control samples and one or several EQA programs. One instrument can not possibly serve as a master for all tests performed within the organization, as it is limited to the set of tests that it supports. Therefore, there will probably be several master instruments within the organization, but each test will only have one specific master assigned. From an economic point of view, it is beneficial if the cost per sample analyzed by the master instrument is low.

The following sections describe the practical workflow performed when handling samples participating in the mentor program.

3.1 Selecting Samples

At every participating site, patient samples are selected at regular time intervals and sent to the master instrument for re-analysis. The selection is random in the sense that any sample could become a candidate for being sent, but the performance will be better if the latency between the adept and master analysis of the sample is short. Also, as testing a broad range of concentrations will improve the quality of the bias estimation, samples known to have widely spread concentrations for the tests of interest should be chosen. The underlying intention of this recommendation is that as large a part of the instrument's supported measurement interval as possible should be covered. The original measurement result reported by the adept instrument is retained, which means that no extra analysis is required by the adept instrument. This can be of relevance, especially for point-of-care instruments whose cost per measurement is higher.

3.1.1 Impact of Time

The analyzed components differ in how they change over time. While some are reasonably stable, others change rapidly, which might increase the uncertainty of the monitoring. Therefore, the time interval between the measurements in the adept and master instruments should be as short as possible. There are a number of ways to handle this. Usually, samples are sent to the mentor instrument immediately after being analyzed in the adept instrument. This works well for components with long decay times. As a rule of thumb, the tests should always be performed during the same day.

An alternative, which is assumed to further improve the precision, would be to split the specimen into two tubes and analyze it simultaneously at the mentor and adept laboratories.

3.2 Data Partitioning

Depending on the scope of the quality assessment, different sets of data are included when the analysis of variance is performed. The scope can be anything from analyzing the performance of a specific assay to estimating the uncertainty of an entire laboratory organization.

3.2.1 Organizational Partitioning

Large laboratory organizations may have multiple divisions, with different quality assurance officers in charge of each. Such divisions might in turn consist of several sub-divisions, which in some cases are scattered over a large geographical area. For instance, the laboratory activity performed in primary care centers might belong to another unit than the analyses done at a central laboratory.

3.2.2 Geographical Partitioning

Laboratory workers typically only handle a small subset of all instruments present within the organization. It might be handy for them to have the results produced at their local site easily accessible. If groups are created based on the instruments present at each site, the laboratory personnel can evaluate their own results without having to see the other measurements being produced, unless they really want to. Figure 3.1 shows a schematic view of one central laboratory, called Main, and three satellite laboratories representing different geographical locations: Lab A, Lab B and Lab C. Each laboratory in turn contains several units.

Figure 3.1. Partitioning of the instruments in a laboratory organization into geographical groups.

3.2.3 Partitioning Based on Analytical Method

It can be of interest to compare groups of measurement results that have been analyzed using different methods. For instance, the concentration of sodium (Na) in plasma can be determined using flame photometry, absorption photometry, or ion-selective electrode based methods. The instruments participating in the EQA (2.2.2) schemes provided by Equalis and LabQuality, and probably other vendors as well, are primarily compared to instruments applying the same analytical method.

3.2.4 Presentation Based Partitioning

The final category of partitionings mentioned here is the one that is simply based on whatever data the operator wants to see grouped together. The previously described partitionings could of course also be considered presentation based. In some cases, the user might want to produce statistical summaries (5.4) containing specific analyses from selected instruments. Creating a group containing only the data of interest can save the user from tedious cut-and-paste work.

3.2.5 Partitioning Groups Throughout the Application

As you will see later on, the data partitioning groups are not only used when performing analysis of variance, but provide a powerful means of structuring the data throughout the application. In addition to the statistical summaries (5.4) and graphical panorama views (5.5), they are used for selecting where to add manually reported results (5.2.2) and for specifying access permissions (5.3.6).

Figure 3.2. Composition of a bar code containing prefix, instrument ID and serial number.

3.3 Identifying Mentor Samples

The samples are usually identified by bar code labels, and the system identifies multiple test runs of the same sample by detecting multiple results reported for the same bar code. If two results of the same test type are reported on the same sample but by different instruments, of which one is defined as the master instrument, they can be used for quality assessment.

Identification of samples must be possible even in cases when the samples have no bar code tags. At health centers, a sample is often analyzed immediately after collection by very simple analyzers lacking bar code readers. In these cases, the samples have to be assigned bar codes prior to being sent to the master instrument, and the adept results need to be registered manually for each sample.

3.3.1 How are the Bar Codes Assigned?

The samples used within the mentor program need to be readily identifiable by the computer system as mentor samples. When the mentor principle was introduced in the first county, Theodorsson suggested a simple system to be used for composing bar codes. A bar code should consist of three parts, namely a prefix, the adept instrument identity, and the serial number within the mentor series. The prefix would make the samples identifiable to the analyst, and in most cases 99 was the number of choice. The instruments in the county were all identified by numbers with up to four digits. When necessary, these identities were left-padded with zeros to a length of four, forming the identity to be used. The serial number used in the last two digits always started at 00 and would, for instance, cover all numbers up to 07 in case eight different bar codes were to be defined. This is illustrated in figure 3.2.
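The composition scheme can be sketched as follows in Python (illustrative only; the actual bar code series were defined per site and printed on label sheets, see 3.3.2, and the instrument ID below is just an example):

    def mentor_barcodes(instrument_id, count, prefix="99"):
        """Compose mentor bar codes: prefix + zero-padded 4-digit instrument ID + 2-digit serial."""
        if not 0 <= instrument_id <= 9999:
            raise ValueError("instrument IDs are assumed to have at most four digits")
        return [f"{prefix}{instrument_id:04d}{serial:02d}" for serial in range(count)]

    # Eight bar codes for a hypothetical instrument 367: 99036700 ... 99036707
    print(mentor_barcodes(367, 8))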

3.3.2 Practical Approach

To simplify the assignment of bar codes to test tubes, custom-printed bar code sheets were produced for each participating site. The sheets were designed using Microsoft Excel in conjunction with readily available barcode fonts, and printed on paper label sheets by an ordinary laser printer. The bar codes were organized in an order allowing the analyst to simply take the next available label, ensuring that tube identities were not reused too often and thus avoiding the risk of identity collisions. Figure 3.3 shows an example of the layout of such a label sheet.

Figure 3.3. Schematic layout of a sheet containing paper labels for use at a site having four different barcode IDs assigned, each of them used at most once a week.

3.4 Retrieving the Results

The FlexLab¹ Laboratory Information System (LIS) that was already in production at the site contains a feature allowing export of any control results to a flat text file upon their arrival in the system. The bar codes assigned for use with the mentor principle were simply registered in the LIS as being control samples. This gave the effect that all mentor results reported from instruments online with the LIS could be retrieved automatically. For instruments not connected to the LIS, a graphical interface was developed to allow manual entry of results.

When two results for the same type of test, reported on the same (mentor) sample, arrive, they are coupled, provided that some additional criteria are fulfilled. The time stamps of the results must show that they were analyzed less than 36 hours apart. Furthermore, at least one of the results needs to be produced by the master instrument. Single results, or measurements with too large a latency between the adept and master analysis, are not combined.

¹The FlexLab Laboratory Information System is maintained and developed by TietoEnator.


Chapter 4

Data Examination

4.1 Statistical processing

We will focus on two different ways of interpreting the data. Either the data can be normalized and then treated using well-known statistical formulae, or we can perform covariance analysis with linear regression to obtain a more specific measure of the uncertainty. We use $x_i$ and $y_i$ to denote the values measured by the master and adept instruments, respectively, for specimen $i$.

4.1.1 Normalization

Normalizing the data refers in this context to expressing the adept measurement result relative to the master result. This is an intuitive approach that can be easily explained and that is used in, for instance, [PCCA06].

After this transformation is done, the mean and standard deviation can be readily estimated. The calculations involved thus require only very basic statistical knowledge in order to be carried out. We write the transformation as:

\[
\hat{y}_i = \frac{y_i}{x_i} \tag{4.1}
\]

Normalized values are marked with a hat (ˆ) throughout this document. Consequently, the mean of the normalized results and the normalized standard deviation can be calculated as:

\[
\bar{\hat{y}} = \frac{1}{n}\sum_{i=1}^{n}\hat{y}_i = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{x_i} \tag{4.2}
\]

\[
\sigma_{\hat{y}} \approx s_{\hat{y}} = \sqrt{\frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{\hat{y}})^2}{n-1}} \tag{4.3}
\]

The normalized CV, coefficient of variation, is calculated as the quotient between the estimated standard deviation and the mean:

\[
CV_{\hat{y}} = \frac{s_{\hat{y}}}{\bar{\hat{y}}} \tag{4.4}
\]
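As a numerical illustration of equations (4.1) to (4.4), the following Python sketch computes the normalized mean, standard deviation and CV for a handful of made-up paired results (the production software described in chapter 5 is written in Delphi; this only illustrates the formulae):

    import math

    def normalized_stats(master, adept):
        """Mean, SD and CV of the normalized results y_i / x_i, per equations (4.1)-(4.4)."""
        y_hat = [y / x for x, y in zip(master, adept)]                 # (4.1)
        n = len(y_hat)
        mean = sum(y_hat) / n                                          # (4.2)
        sd = math.sqrt(sum((v - mean) ** 2 for v in y_hat) / (n - 1))  # (4.3)
        return mean, sd, sd / mean                                     # (4.4)

    # Hypothetical paired hemoglobin results [g/L]: master (x_i) and adept (y_i)
    master = [102.0, 131.0, 118.0, 145.0, 160.0]
    adept = [104.0, 133.0, 117.0, 149.0, 163.0]
    print(normalized_stats(master, adept))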

4.1.2 Covariance Analysis

Even though normalization is an intuitive way of handling the data, it also discards some useful information. For instance, the bias for different subsets of the measuring range can not be calculated using normalized data alone. Also, the linear regression relationship which we obtain will prove very useful in expressing the bias at different levels. This is the way of processing the data advocated by [fCLS93].

The model is based on the assumption that, if random errors are omitted, there exists a linear relationship between the outcomes of two instruments measuring the same analyte in the same sample. Even though this does not hold universally, it is very likely to be true, at least within the range of measurement. After all, we want the instruments to report the same result for the same sample, which is a one-to-one relation and hence linear. It is also sensible to assume that deviations from this are small enough to be approximated by a linear equation.

The assumed relationship between a value $y_i$ measured by the adept instrument and the corresponding master instrument value $x_i$ is therefore:

\[
y_i = \alpha x_i + \beta + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma), \quad (\epsilon_i, \epsilon_j) \text{ independent when } i \neq j \tag{4.5}
\]

The random error $\epsilon_i$ contains contributions from both the master and the adept instrument. If the contribution from the master instrument is known, it can be compensated for; otherwise, the composite random error still serves as an upper limit for the random error.

Furthermore, $x_i$ is subject to the random and systematic errors of the master instrument. We are not claiming that $x_i$ is the true value, but rather assuming that the systematic errors are low and that the random errors are small. Also, we assume that the distribution of the random errors is independent of the values of $x_i$ and $y_i$. We denote the random components of the master and adept instruments $\epsilon_{x_i} \sim N(0, \sigma_x)$ and $\epsilon_{y_i} \sim N(0, \sigma_y)$ respectively, and assume that these are independent of each other. We have (see appendix B) that

\[
\epsilon_i = \epsilon_{y_i} - \alpha \epsilon_{x_i} \tag{4.6}
\]

Given this and the normality and independence assumptions above, we get:

\[
\sigma^2 = \sigma_y^2 + \alpha^2 \sigma_x^2 \tag{4.7}
\]

The mean values of the master and adept instruments, $\bar{x}$ and $\bar{y}$ respectively, are of course related:

\[
\bar{y} = \alpha \bar{x} + \beta + \sum_{i=1}^{n} \frac{\epsilon_i}{n} \tag{4.8}
\]

But of even greater interest are the estimated parameters $\alpha$ and $\beta$ and the other information revealed by the covariance analysis. There are several methods available for estimating them. In this case, classical linear regression was chosen [Sea87]. This is equal to minimizing the sum of squares $SS_e$ from the line formed by $\alpha$ and $\beta$ to the given measurements:

\[
SS_e = \sum_{i=1}^{n} (y_i - \alpha x_i - \beta)^2 \tag{4.9}
\]

It can be shown [Sea87] that this is fulfilled by:

\[
\alpha = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \tag{4.10}
\]

\[
\beta = \bar{y} - \alpha \bar{x} \tag{4.11}
\]

Once $\alpha$ and $\beta$ are estimated, we obtain an estimate of the systematic error $B_c$, relative to the master instrument:

\[
B_c(x) = \alpha x + \beta - x = (\alpha - 1)x + \beta \tag{4.12}
\]

The residual sum of squares $SS_e$ expresses the variation of $y_i$ that can not be explained by its linear relation to $x_i$. We can in fact use $SS_e$ to calculate an estimate of the random error. As we have estimated two parameters, $\alpha$ and $\beta$, the number of degrees of freedom is $n - 2$:

\[
\sigma^2 \approx s_e^2 = \frac{SS_e}{n - 2} = \frac{1}{n - 2}\sum_{i=1}^{n}(y_i - \alpha x_i - \beta)^2 \tag{4.13}
\]

The random error obtained from $SS_e$ is influenced by the random error of the master as well as the adept instrument. As we assume that these errors are independent and Gaussian, we can estimate the adept random error by rewriting (4.7):

\[
\sigma_y = \sqrt{\sigma^2 - \alpha^2 \sigma_x^2} \approx \sqrt{s_e^2 - \alpha^2 \sigma_x^2} \tag{4.14}
\]
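The estimation steps in equations (4.9) to (4.14) can be illustrated with a short Python sketch (illustrative only; the thesis software is written in Delphi, and the data below are made up). It assumes that the master instrument standard deviation sigma_x has already been estimated as described in section 4.1.3:

    def regression_uncertainty(master, adept, sigma_x):
        """Estimate alpha, beta, residual SD s_e and the adept SD, per equations (4.9)-(4.14)."""
        n = len(master)
        x_bar = sum(master) / n
        y_bar = sum(adept) / n
        alpha = (sum((x - x_bar) * (y - y_bar) for x, y in zip(master, adept))
                 / sum((x - x_bar) ** 2 for x in master))                       # (4.10)
        beta = y_bar - alpha * x_bar                                            # (4.11)
        ss_e = sum((y - alpha * x - beta) ** 2 for x, y in zip(master, adept))  # (4.9)
        s_e = (ss_e / (n - 2)) ** 0.5                                           # (4.13)
        s_y = max(s_e ** 2 - alpha ** 2 * sigma_x ** 2, 0.0) ** 0.5             # (4.14), floored at 0

        def bias(x):
            return (alpha - 1) * x + beta                                       # (4.12)

        return alpha, beta, s_e, s_y, bias

    master = [102.0, 131.0, 118.0, 145.0, 160.0]
    adept = [104.0, 133.0, 117.0, 149.0, 163.0]
    alpha, beta, s_e, s_y, bias = regression_uncertainty(master, adept, sigma_x=1.0)
    print(round(alpha, 3), round(beta, 2), round(s_e, 2), round(s_y, 2), round(bias(140.0), 2))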

4.1.3 Estimating the Master Instrument Random Error

The random error $\sigma_x$ of the mentor instrument can either be estimated according to the recommendations in the GUM document [QUA00] (see 2.1.6), or by observing the measurement values delivered by the instrument under stable operation. After all, why not use both methods? It is the same number that is being sought, regardless of the methodology in use. In order to estimate the random error based on analyzing the production, results obtained from analysis of control samples can be used. Assume that you have collected control result data for $m$ different expected levels, and calculated the sample standard deviations $s_1, \ldots, s_m$ for this data. For level $j$ we assume that $n_j$ measurements were collected. The standard deviations can then be combined by pooling:

\[
s_x = \sqrt{\frac{\sum_{i=1}^{m}(n_i - 1)s_i^2}{\sum_{i=1}^{m} n_i - m}} \tag{4.15}
\]

As an example, consider having run $n_1$, $n_2$ and $n_3$ control samples at three different concentration levels. The sample standard deviations $s_1$, $s_2$ and $s_3$ are obtained. The pooled standard deviation can then be calculated as follows:

\[
s_x = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2}{n_1 + n_2 + n_3 - 3}} \tag{4.16}
\]
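In code form, the pooling in (4.15) looks as follows (a Python sketch with made-up control counts and standard deviations):

    def pooled_sd(counts, sds):
        """Pool per-level sample standard deviations into one estimate, per equation (4.15)."""
        m = len(counts)
        numerator = sum((n - 1) * s ** 2 for n, s in zip(counts, sds))
        return (numerator / (sum(counts) - m)) ** 0.5

    # Hypothetical control series: 30, 28 and 25 results at a low, mid and high level.
    print(pooled_sd(counts=[30, 28, 25], sds=[1.1, 1.4, 1.9]))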

4.2 Graphical Presentation

The NCCLS guide Method Comparison and Bias Estimation using Patient Samples [fCLS93] proposes two kinds of graphs for visualizing the data: the scatterplot (figure 4.1) and the bias plot (figure 4.2).

4.2.1 Scatterplot

The scatterplot is a diagram where each measurement is represented by a dot whose x-coordinate is determined by the master instrument's measurement, and whose y-coordinate is determined by the result of the adept instrument. It is usually easy to verify by looking at the scatterplot that there in fact exists a linear relationship between the results from the adept and master instruments. Extreme outliers can also be detected by simple visual inspection. It is of great use to visualize the linear regression line in the graph as well. In the absence of systematic errors, the linear regression line will have unit slope and no intercept.

4.2.2 Bias Plot

The bias plot is in many ways similar to the scatterplot, except that it is the bias $y_i - x_i$ that determines the location on the y-axis instead. Even though the information on the size of the deviation is also contained in the scatterplot, it is much easier to read from the bias plot, because of the higher resolution on the y-axis.

Figure 4.1. Scatterplot where the master concentration is plotted on the x-axis and the adept concentration on the y-axis. In this example, two high-volume Advia instruments of the same type are being compared to each other; the fitted regression line is y = 1.04x - 3.68.


Figure 4.2. Bias plot where master concentration is plotted on the x-axis and the deviation of the adept instrument is plotted on the y-axis. The same data is used as in figure 4.1.


Chapter 5

Implementation

5.1 Architectural Decisions

This section describes the most important strategic decisions taken during the planning phase.

5.1.1 Borland Delphi

Borland Delphi 4 (and later on 6) was used as the integrated development environment throughout the project. It facilitates developing applications for the 32-bit Microsoft Windows (x86) platform. The programming language used is called ObjectPascal and is an object-oriented successor of the well-known Turbo Pascal. The main reason behind this choice is my previous familiarity with the product and language. Having to learn a new development platform would have considerably delayed any outcome.

5.1.2 InterBase/Firebird

Borland InterBase was initially selected for storing the database. This choice was also mainly based on previous experience with the product, primarily from lab courses at university, but also because of Borland Delphi's excellent support for communicating with it. Borland then unexpectedly decided to release the source code of InterBase under an open-source license. The Firebird database is essentially a distribution of InterBase based on this source code, but not maintained by Borland. Firebird 1.0 is for all practical purposes identical to InterBase 6.0, except for some additional bug fixes and easier client deployment. InterBase was for this reason replaced by Firebird, even though this did not noticeably affect the design of the product.

Development first started using Borland Database Engine for connecting with the database. It was however for performance and deployment reasons replaced with Jason Wharton’s InterBase Objects component (www.ibobjects.com).


5.1.3 Two-Tier Solution

The solution was implemented as a two-tier solution. This means that the client software communicates directly with the database. (See 6.5.1 for a deeper analysis of this.) The underlying reasons were to focus on the actual problem to be solved, and to utilize the rich set of tools available in Borland Delphi for Rapid Application Development.

5.2 Data Acquisition

By far the most convenient method of collecting the data is extracting it from a Laboratory Information System. Practically all laboratories collect their test results in some kind of on-line storage nowadays, and much labor can be avoided this way. For monitoring the quality of laboratories producing millions of test results every year, manual handling is not an option, except possibly for a few low-volume instruments.

Figure 5.1. Automatic and manual data acquisition. Instruments on-line with a Laboratory Information System (LIS) report their data in the usual way, and the LIS passes it on to the importer process. Results from off-line instruments need to be registered manually, preferably through a graphical user interface.

5.2.1 Automatic Data Acquisition

Data is extracted by connecting to already existing functionality of the LIS that produces flat files in the format initially used with the MS-DOS version of Quality Management. This program visualized conventional control sample results and evaluated ordinary Westgard rules [WBHG81], as well as doing basic statistical calculations on the collected data. Thus, the LIS already contained working functionality producing flat text files at regular intervals and making them available for download by FTP.

When data acquisition is initiated, the importer process initiates an FTP session with the LIS server where the files are stored. It browses through the available files to see if any new files have arrived or if new data has been added to any of the existing files. This check is done on the basis of file names and sizes. The importer process keeps information about which files were present the last time it visited the file server, and automatically downloads any files that have changed since then.

Start  Length  Description                Example
0      4       Instrument ID              367
5      6       Test ID                    P-Na
12     20      Test name                  P-Sodium
33     8       Laboratory ID              9999502
42     10      Time stamp (YYMMDDHHMM)    0112241500
53     8       Concentration              149
62     8       Expected conc.             148
71     8       CV within day              0.96
80     8       CV between days            0.96
89     10      Signature                  STNO
99     -       Memo                       Tightened limits

Table 5.1. Format of the flat text files used for importing data into the control result database.

During parsing of the flat text files, one line is processed at a time, and the contents of each record field are extracted from the line and inserted into the control result storage. The structure of the flat text files is given in table 5.1.

There is no special field in the import format for indicating exclusion of individual sample results. This is instead indicated by setting the character x in the last position of the Signature field.

One limitation of this protocol is that it is rather cumbersome to tell whether or not a result has already been imported into the control database. If the server produces one file a day that grows incrementally, this will cause the application to download the same file over and over again. For each result present in the file, it must then decide whether or not that result has already been imported. This was solved by assuming that an instrument will not produce tests of the same samples with the same results at the same exact time instant. Even though this by itself is a sensible assumption, it is complicated by some data being reported batch-wise from the LIS, giving them all the same time stamp. Occasionally, this can lead to loss of control results during import.
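A sketch of such a fixed-width parser in Python (the actual importer is part of the Delphi application; the field boundaries follow table 5.1, and treating the Memo field as running to the end of the line, as well as the file name in the usage comment, are assumptions):

    # (start, length, name) per table 5.1; a length of None means "rest of the line"
    FIELDS = [(0, 4, "instrument"), (5, 6, "test_id"), (12, 20, "test_name"),
              (33, 8, "lab_id"), (42, 10, "timestamp"), (53, 8, "conc"),
              (62, 8, "exp_conc"), (71, 8, "cv_wd"), (80, 8, "cv_bd"),
              (89, 10, "signature"), (99, None, "memo")]

    def parse_line(line):
        """Split one fixed-width record into a dict of stripped field values."""
        record = {}
        for start, length, name in FIELDS:
            end = None if length is None else start + length
            record[name] = line[start:end].strip()
        # Exclusion of an individual result is signalled by an 'x' in the last
        # position of the Signature field (the character just before the Memo field)
        record["excluded"] = line[98:99] == "x"
        return record

    # Usage sketch: parse every line of a downloaded export file (hypothetical file name)
    # with open("control_export.txt") as f:
    #     records = [parse_line(line.rstrip("\n")) for line in f]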

5.2.2 Manual Data Entry

Two different interfaces are provided for manual data entry. The first of them constitutes the main window of the client application, see figure 5.2. It has been implemented as a data-aware grid with a direct connection to the database. The user enters control results manually by adding more rows in a simple and familiar fashion. This simple implementation has worked surprisingly well, even over wide area networks, but it only allows entry of results that do not need to be coupled together. This is a consequence of the grid's direct coupling to the database.

In order to remedy this, a specific form for manual entry of adept/master results had to be added. It is very simple in its layout and provides entry of individual results, completely lacking any fancy features. The user enters the instrument and sample IDs, and the IDs of the tests along with the reported values for each test performed. The application then checks whether the sample ID belongs to the set of designated mentor IDs. If not, the value can be added directly to the control result database without any specific modifications. If, on the other hand, it is a master or adept result, it must instead be stored in an intermediate table, waiting for a measurement to be coupled with.

Figure 5.2. Main window of the client software allowing manual entry of control data. The results already reported are shown in the right-hand grid.

5.2.3 Combining Master and Adept Results

This section demonstrates how the matching of master and adept results is performed during import. The database tables mentioned here are explained in sections 5.3.7 and 5.3.1.

Whenever a result enters the system, regardless of whether it has been input manually or automatically transmitted from the Laboratory Information System, the Laboratory ID of the sample is matched against the identities listed in the MentorLids table. If a match is found, the result is added to the MentorResults table; otherwise it is added to the Result table, as it is not a candidate for mentor matching anyway.

After a result has been added to MentorResults, the system checks whether there are other results in the table that the new result can be matched to. For matching to take place, both the test ID and the laboratory ID of the results must match. Furthermore, the results to combine must not differ by more than 72 hours in time stamp, and one of the results must come from the master instrument.

When a result matching the recently added result is found, the two are combined into one result entry, having the adept concentration as the measured concentration and the master instrument's measurement as the expected concentration. The time stamp of the combined result is set to the time stamp of the adept measurement, and the Laboratory ID of the combined result is set to the Laboratory ID of the mentor group (see 5.3.7) that the sample belongs to.
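The matching and combination rules can be sketched as follows (a simplified Python rendering of the logic above; the production logic lives in the Delphi client and the Firebird database, and the record fields shown are reduced to the essentials):

    from datetime import timedelta

    MAX_LATENCY = timedelta(hours=72)

    def find_match(new_result, pending, master_instruments):
        """Return a pending result the new one can be coupled with, or None.

        Both results must share test ID and laboratory ID, lie within the latency
        window, and at least one of the two must come from the master instrument.
        """
        for candidate in pending:
            if candidate["test_id"] != new_result["test_id"]:
                continue
            if candidate["lab_id"] != new_result["lab_id"]:
                continue
            if abs(candidate["time"] - new_result["time"]) > MAX_LATENCY:
                continue
            if {candidate["instrument"], new_result["instrument"]} & master_instruments:
                return candidate
        return None

    def combine(adept, master):
        """Build the coupled entry: adept value as measured, master value as expected."""
        return {"test_id": adept["test_id"], "time": adept["time"],
                "conc": adept["conc"], "exp_conc": master["conc"]}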

5.3 Data Structures

The intention behind this section is to describe how the data are stored in the database. The focus is on describing the logical relationships and types rather than giving an exact description of the fields and indexes used. For instance, properties accepting true and false as their only values are described as boolean, because that is their logical function, rather than as int, which, due to technical limitations, was used to represent them in the database. Also, a small note about the naming convention used for the database tables: the tables have been named as the plural form of the entity they contain records of. For instance, the table holding Method entries is called Methods. The notable exception is the result table, which for no particular reason is called Result. In the UML diagrams, the entity name rather than the corresponding table name is given.

5.3.1 Storage of Control Results

Conventional control results and coupled mentor results are stored in a single table, Result, where every row represents a measurement. Figure 5.3 shows the fields of Result and its related entities.

For every measurement, the observed concentration (Conc) and the point in time of the analysis (TimeInfo) are stored. The result is also accompanied by its expected concentration (ExpConc). For results created by mentor coupling (5.2.3), the master instrument provides the expected value. Each result also contains information about its expected variation around the expected value. This is specified as coefficients of variation within (CV_WD) and between (CV_BD) days. If the analyst has decided that the result is a clear outlier that should be excluded from statistical summaries, it can be flagged for exclusion (Excluded). This will prevent the measurement from triggering control rules and affecting statistics. The reason for exclusion should be stated in the Memo of the result. Each measurement needs to be linked to the measuring device (Method) that generated it, and also to the Component whose concentration was determined. Furthermore, the result must be linked to the Laboratory ID (LID) of the sample. For performance reasons, the result table is not directly linked to these tables, as this would lead to unnecessary data repetition. Instead, the notion of Sequences was introduced. A sequence is defined as a unique combination of Component, Method and LID. Each such combination present among the results is assigned a generated numeric key and added to the Sequences table. The result records are then linked to the Sequence they belong to. The concept of sequences is not established outside this work.

Figure 5.3. Diagram of the data model representing the results stored in the database. The table containing results does not refer directly to Component, Method and LID, but rather points at them through the sequence number of each result. [Entities shown in the diagram: Result (TimeInfo, Conc, ExpConc, CV_WD, CV_BD, Excluded, Signature, Memo), linked 0..* to 1 to Sequence (Sequence, ExpConc, CV_WD, CV_BD), which in turn is linked 0..* to 1 to Component (Component, Name, CV_B), Method (Method, Name) and LID (LID, Name, AliasFor, Normalize).]

5.3.2 Default Expected Concentration and Limits

For each Sequence, the default expected concentration (ExpConc) and coefficients of variation within and between days (CV_WD and CV_BD respectively) can be specified. Although the import file format (see 5.2.1) supports receiving these values from the Laboratory Information System, the possibility of defining them per sequence provides an option when the incoming data lack this information. The values of ExpConc, CV_WD and CV_BD in the Sequence record are used when no such information has been received from the LIS, and for manually registered measurements. For mentor results, CV_WD and CV_BD are picked from the sequence.

5.3.3

Sequence Activity Period

The introduction of the Sequences table also brings the benefit of making it easier to retrieve information such as:

• Which control materials are in use for a specific instrument and test?
• Which tests is a specific control material used for?
• Which instruments perform a specific test (on a specific material)?

When the Sequences table is present, there is no need to go through the entire Result table in order to find these answers. The questions can be answered by querying the Sequences table alone.
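For example, the first question in the list above could be answered with a query of roughly the following shape. The values are invented for the illustration, and the schema follows the sketch used earlier in this section.

    -- Sketch: which control materials (LIDs) are in use for
    -- instrument 1101 and component 1234?
    SELECT DISTINCT s.LID
    FROM   Sequences s
    WHERE  s.Method    = '1101'
      AND  s.Component = '1234';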

There is a time aspect to this as well. Typically the user would only be interested in seeing the control materials and instruments currently in use. The Sequences table, on the other hand, contains all combinations of Component, Method and LID that exist throughout the data. Unless some additional measures are taken, sequences that are no longer in active use would also be returned.


Figure 5.4. Relation between the Sequences and SequenceCacheInfo tables.

To remedy this, an additional table has been created, SequenceCacheInfo, keeping track of the time intervals during which each sequence has reported activity. Its relationship to the Sequences table is shown in figure 5.4. Every time a record is updated or inserted into the result table, a trigger checks whether that record's time stamp is within the current activity date range. If it is not, it extends the range to cover all dates present among the results for that sequence. Whenever the sequence table is queried, the result can be filtered through the contents of the SequenceCacheInfo table, only letting through sequences having results within the time frame of interest.
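A trigger implementing this bookkeeping could look roughly as follows. The sketch uses Transact-SQL-style syntax and an invented trigger name; it is not the actual production trigger, and the case where no SequenceCacheInfo row exists yet for the sequence is left out for brevity.

    -- Sketch: after inserting or updating results, widen the cached activity
    -- interval of each affected sequence so that it covers the new time stamps.
    CREATE TRIGGER trg_Result_UpdateCache ON Result
    AFTER INSERT, UPDATE
    AS
    BEGIN
        UPDATE c
        SET StartDate = CASE WHEN n.MinTime < c.StartDate THEN n.MinTime ELSE c.StartDate END,
            EndDate   = CASE WHEN n.MaxTime > c.EndDate   THEN n.MaxTime ELSE c.EndDate   END
        FROM SequenceCacheInfo c
        JOIN (SELECT Sequence, MIN(TimeInfo) AS MinTime, MAX(TimeInfo) AS MaxTime
              FROM inserted
              GROUP BY Sequence) n ON n.Sequence = c.Sequence;
    END;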

This has worked very well for automatically collected data. However, if a user by mistake enters an incorrect date for a result, the activity range will be extended and not restored, even after the erroneous date has been corrected. The same situation can arise when a result is registered with an incorrect test, instrument or laboratory ID. That would trigger the creation of a new sequence, which in turn shows up in queries, even though the result itself has since been deleted. As the performance benefits outweigh these drawbacks, the solution has been kept, with the addition of a clean-up function that on manual invocation ensures that the SequenceCacheInfo contents are up to date with the results.
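The clean-up function essentially recomputes the cached intervals from scratch. A sketch of its core statements, under the same simplified schema as before, is shown below.

    -- Sketch: rebuild the cached activity intervals from the actual results.
    DELETE FROM SequenceCacheInfo;

    INSERT INTO SequenceCacheInfo (Sequence, StartDate, EndDate)
    SELECT Sequence, MIN(TimeInfo), MAX(TimeInfo)
    FROM   Result
    GROUP  BY Sequence;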

5.3.4 Data Partitioning Groups

The hierarchy of partitioning groups (3.2) is stored in the GroupTree table. Every node in the tree is given a name and has a designated parent, except for the root node, which is defined as its own parent. The nodes are identified by generated surrogate keys (GroupNr).

For every node in the tree, the contents can be defined. As every group by definition includes the contents of all its children, it is usually only necessary to add definitions for the leaf nodes in the tree. The definitions are given as statements about what is to be either included in or excluded from the particular group. Each statement consists of a triple of the form (Component, Method, LID). An asterisk "*" can be used as a wildcard when specifying Component, Method and LID. If an asterisk is found in a position, the inclusion/exclusion applies to any value in that position. For example, specifying "*" as instrument and material, and 1234 as component, will include all control samples with the given component in the group, provided that exclusion is set to false.


Figure 5.5. Representing the tree structure of groups in the database (entities GroupTree, GroupDetail and GroupSequence).



The corresponding GroupTree table for the tree in figure 5.6 on the facing page, where the groups have been assigned the numbers in brackets, can be seen in table 5.2.

ID   Name         Parent
-1   Root         -1
 1   Lab          -1
 2   Brand        -1
 3   Linköping     1
 4   Norrköping    1
 5   Motala        1
 6   Acme          2
 7   ABC           2
 8   Lab A         3
 9   Lab B         3

Table 5.2. The same tree expressed using the GroupTree structure.

GroupNr   Exclude   Component   Method   LID
4         false     *           1101     *
4         false     *           1102     *
5         false     *           1209     *
6         false     *           1010     *
6         false     *           1012     *
6         false     *           1209     *
7         false     *           1015     *
7         false     *           1101     *
7         false     *           1102     *
8         false     *           1010     *
9         false     *           1012     *
9         false     *           1015     *

Table 5.3. Details for the GroupTree.

5.3.5 Caching the Group Tree

Although the group tree structure is neat for defining groups and the relationships between them, finding out which sequences are part of a specific group brings considerable work to the database engine. This is further complicated by allowing the asterisk "*" to be used as a wildcard. The solution chosen to reduce this problem is to generate a table, GroupSequences, containing the set of sequences that each group consists of. That table can then be used for rapid joins during database queries. This, of course, comes at the cost of keeping the table up to date with the current group and sequence definitions. Support has been added for automatically connecting newborn sequences to the groups they belong to, and also for deleting group memberships when sequences are removed from the database. However, after altering the definition of the group tree structure or contents, a function that recreates the entire content of the GroupSequences table needs to be executed. Figure 5.5 illustrates how the GroupSequences table ties together GroupTree nodes and Sequence entries.
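The regeneration step can in principle be expressed as a single query that expands the wildcard definitions against the Sequences table. The sketch below assumes that the asterisk is stored literally in the GroupDetail columns; exclusion statements, the IsInherited flag and the propagation of sequences to parent groups are left out for brevity.

    -- Sketch: rebuild GroupSequences from the wildcard definitions in GroupDetail.
    -- Exclusions and inheritance to parent groups are omitted.
    DELETE FROM GroupSequences;

    INSERT INTO GroupSequences (GroupNumber, Sequence)
    SELECT DISTINCT d.GroupNumber, s.Sequence
    FROM   GroupDetail d
    JOIN   Sequences  s
           ON  (d.Component = '*' OR d.Component = s.Component)
           AND (d.Method    = '*' OR d.Method    = s.Method)
           AND (d.LID       = '*' OR d.LID       = s.LID)
    WHERE  d.Exclude = 0;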

5.3.6 Authorization and Authentication

The system requires some means of authenticating its users. This is commonly done by giving the user a login name and password. The authentication process relies completely on the authentication mechanism of the database. This implies that the user must have a database account in order to access the data. Functionality for automatically adding users to the database user register when application users are created makes this reasonably easy to administrate. Also, in the database table UserInfo, application-specific information about each user is stored. This includes a numerical key (ID), the database Login to map the entry to, and whether or not the user should have Administrator rights. The UserInfo record is linked to a group entry, telling which data subset to show when the user logs into the system. See figure 5.7.

Opinions diverge considerably regarding whether or not altering quality control results should be permitted once they have been registered into the system. From my point of view, some functionality for correcting obviously invalid data must be present, but only available to a restricted subset of users. The permissions entrusted to various users are controlled by the UserPermission table. Each row grants a user access to the partitioning group specified by the GroupNumber. The PermissionType tells whether the user should have read-only permission or full read/write permissions.

Users are commonly configured to have read access to all results. This can be accomplished by granting permission for the root of the tree, since every group contains all its subgroups.
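The effect of granting permission on the root node is easiest to see in a sketch of how a permission check could be phrased: a user may see a result if any group granted to that user contains the result's sequence. The query below is illustrative only, and the numeric values of PermissionType are assumptions.

    -- Sketch: results visible to a given user
    -- (assuming 1 = read and 2 = read/write permission).
    SELECT r.*
    FROM   Result r
    JOIN   GroupSequences gs ON gs.Sequence    = r.Sequence
    JOIN   UserPermission up ON up.GroupNumber = gs.GroupNumber
    WHERE  up.ID = @UserId
      AND  up.PermissionType >= 1;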

Ideally, the system should also provide a full version history of all changes made to a result. This support has, however, not been implemented within the scope of the thesis.

5.3.7 Definition of Mentor Participation

The last group of tables left to describe are those relating to the definition of the sample IDs to be used when registering patient sample results for use with the mentor principle.

The definition involves the concept of a mentor group, which denotes a set of sample identities used for this split-sample testing at a particular site. All identities within the same mentor group have the same master instrument assigned, and the coupled results are presented as the same target material.



Figure 5.7. Data model for authorizing users to view or modify control data (entities UserInfo, GroupTree and UserPermission).

Figure 5.8. How series of LIDs for use with the mentor principle are stored in the database (entities MentorGroup, MentorLid and MentorResult).

The master instrument and target material are specified by the Method and LID columns of each record in MentorGroups. Each mentor group can also be given a name; this name is not used anywhere in the application, but it can still help the administrator describe the contents of the group. Each mentor group is identified by a Code whose only purpose is to form a primary key for the entry. To each mentor group, a set of Mentor LIDs is connected. These contain the actual laboratory identities the patient samples should be re-tagged with in order to belong to this group. This relationship is shown in figure 5.8.
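As a concrete illustration, a hypothetical mentor group with two re-tagging identities could be registered as follows. The codes, names and LIDs are invented for the example, and the table names follow those used in the text.

    -- Sketch: a mentor group whose master is instrument 1101 and whose coupled
    -- results are presented as target material 'MENTOR-A', with two mentor LIDs.
    INSERT INTO MentorGroups (Code, Name, Method, LID)
    VALUES (1, 'Chemistry mentor, site A', '1101', 'MENTOR-A');

    INSERT INTO MentorLids (Code, LID) VALUES (1, 'M-0001');
    INSERT INTO MentorLids (Code, LID) VALUES (1, 'M-0002');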

The MentorResults table is used for intermediate storage of uncoupled results waiting for a partner. When a result having one of the designated mentor IDs enters the system, it is copied to this table. See 5.2.3 for a detailed description of how the matching is performed.
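The copying step can be sketched as follows. The parameter names are illustrative, and the actual import logic may of course be implemented in program code rather than as a single SQL statement.

    -- Sketch: copy an incoming result into MentorResults if its LID is one
    -- of the designated mentor identities.
    IF EXISTS (SELECT 1 FROM MentorLids WHERE LID = @LID)
        INSERT INTO MentorResults (Component, Method, LID, TimeInfo, Conc, Excluded, Memo)
        VALUES (@Component, @Method, @LID, @TimeInfo, @Conc, 0, NULL);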

Results are left in the table after coupling has been performed. This makes it possible to calculate statistics on the latency between adept and master measurements, and also on the number of results that never find a partner, perhaps due to too long a time interval or to samples getting lost.

5.4 Statistical Summaries

Statistical summaries can be generated from the collected data. This functionality is important in order to provide means for the user to interpret the results. Two
