Measurement repeatability profiles of eight frequently requested measurands in clinical chemistry determined by duplicate measurements of patient samples



Scandinavian Journal of Clinical and Laboratory Investigation

ISSN: 0036-5513 (Print) 1502-7686 (Online) Journal homepage: https://www.tandfonline.com/loi/iclb20


Anders Kallner, Astrid Petersmann, Matthias Nauck & Elvar Theodorsson

To cite this article: Anders Kallner, Astrid Petersmann, Matthias Nauck & Elvar Theodorsson (2020): Measurement repeatability profiles of eight frequently requested measurands in clinical chemistry determined by duplicate measurements of patient samples, Scandinavian Journal of Clinical and Laboratory Investigation, DOI: 10.1080/00365513.2020.1716266

To link to this article: https://doi.org/10.1080/00365513.2020.1716266

Published online: 23 Jan 2020.


ORIGINAL ARTICLE


Anders Kallner (a), Astrid Petersmann (b, c), Matthias Nauck (c) and Elvar Theodorsson (d)

(a) Department of Clinical Chemistry, Karolinska University Hospital, Stockholm, Sweden
(b) Institut für Klinische Chemie, Universitätsmedizin, Göttingen, Germany
(c) Institute of Clinical Chemistry and Laboratory Medicine, University Medicine, Greifswald, Germany
(d) Department of Clinical Chemistry and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden

ABSTRACT

Measurement uncertainties in clinical chemistry are commonly regarded as heteroscedastic, i.e. having a constant relative standard deviation irrespective of the concentration of the measurand. The uncertainty is usually determined at two concentrations using stabilized control materials and assumed to represent the analytical goal. The purpose of the present study was to use duplicates of unselected patient samples to calculate the absolute and relative repeatability component of the intra-laboratory measurement uncertainty, using the Dahlberg formula and analysis of variance components. Estimates were made at five different concentration intervals of ALT, AST, Calcium, Cholesterol, Creatinine, CRP, Triglycerides and TSH covering the entire concentration interval of the patient cohort. This partitioning allows detailing their repeatability profiles. The calculations of the profiles were based on randomly selected results from sets of duplicates ranging from 12,000 to 65,000 pairs. The repeatability of the measurands showed substantial variability within the measuring interval. Therefore, characterizing imprecision profiles as purely homo- or heteroscedastic or by a single number may not be optimal for the intended use. The present data make a case for nuancing the evaluation of analytical goals and minimal differences of measurement results by establishing uncertainty profiles under repeatability conditions, using natural patient samples.

ARTICLE HISTORY
Received 22 September 2019; Revised 12 November 2019; Accepted 25 December 2019

KEYWORDS
Dahlberg uncertainty; reproducibility; analysis of variance components; ANOVA

Introduction

Laboratories have an obligation, e.g. according to accreditation standards and legal requirements, to provide method performance criteria. Among other parameters, determination of the uncertainty of the measurement method is required, prompting laboratories to report a standard deviation (s) or coefficient of variation (%CV, i.e. the standard deviation expressed relative to a specified value). The measurement uncertainty of laboratory results is important in its own right and for the use of the results in clinical decisions, e.g. in comparing patient results with reference intervals or with previous results, i.e. in defining the minimal difference (MD) or reference change value (RCV). Metrologically, the uncertainty, expressed as absolute (s) or relative (%CV) standard deviation, may differ within the measuring interval, and a detailed account of the performance can be summarized in an uncertainty profile [1,2].

Measurement procedures are described as having either a constant absolute (s) or a constant relative (%CV) standard deviation in the measuring interval. These situations are called homoscedastic (showing a constant standard deviation) and heteroscedastic (not showing a constant standard deviation), respectively. In practice, however, measurement procedures are rarely purely homoscedastic or heteroscedastic [3]. Consequently, the measurement uncertainty at a particular concentration cannot always be extrapolated from the available measurements of control materials.

Measurement uncertainty is the combined uncertainty of many sources, both physical and conceptual. Physical sources may be volume, temperature, pre-analytical effects etc., whereas the ‘conceptual’ can be repeatability, reproducibility and combined, i.e. total or intra-laboratory uncertainty. Methods for appraising the physical uncertainty are described and formalized in the standard GUM, ISO-BIPM [4] document whereas the conceptual may be detailed in the scientific and professional literature [5–7].

The Dahlberg uncertainty, also recognized as the Dahlberg error or the Dahlberg factor, is calculated under ‘repeatability’ conditions and therefore addresses only a part of the combined uncertainty under stable analytical conditions. Recently, we described and explored the mechanism of calculating the standard deviation according to the Dahlberg procedure [6] and compared that estimate with standard deviations derived conventionally [7,8] and by means of analysis of variance components (ANOC). We concluded that the Dahlberg procedure offers a ‘best estimate’ of the uncertainty expressed as a weighted standard deviation under repeatability conditions in a defined measuring interval. Conventional formulas provide the repeatability or intra-laboratory imprecision and the analysis of variance components method (ANOC) can identify the repeatability and reproducibility components of the intra-laboratory imprecision depending on the experimental design [9,10]. The Dahlberg method is commonly based on patient samples of varying concentrations analysed in duplicates, and the results will only be valid in the actual concentration interval. The conventional formula and the ANOC require that the same sample is used. If, however, a set of duplicates of varying concentration is analysed by the ANOC, the calculated repeatability profile will be valid and show the same result as the Dahlberg formula, whereas the between-group variance describes the distribution of the sample results and thus not the true reproducibility.

CONTACT Anders Kallner, anders.kallner@ki.se, Department of Clinical Chemistry, Karolinska University Hospital, Stockholm SE 171 76, Sweden

© 2020 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.

The ANOC has the advantage over the Dahlberg approach that it allows several observations in each group rather than just duplicates, since the mean squares are based on the sum of squares, which can summarize any number of observations. The relative repeatability imprecision which can be calculated according to Dahlberg should be based on the relative difference of each pair of duplicates, i.e. related to the average of the results of the pair [7]. It will therefore be relative to the average of the measured samples and not a predetermined value. A similar calculation might be difficult to achieve with the ANOC.

In the present study we calculated the repeatability profiles for eight commonly requested quantities, in order to determine to what extent the profiles were homo- or heteroscedastic. The data were previously accumulated during about one year for a study of the frequency of extreme differences between duplicate measurements [8]. The measurement procedure was based on duplicate measurements (repeats) performed adjacent to each other to ascertain that there is no, or only a negligible, bias.

Method and materials

Recently, we reported the results of a study of the frequency of outliers and extreme differences identified by a large number of duplicate measurements of natural patient samples [8]. Between about 12,000 and 64,000 duplicate values of eight frequently requested measurands were measured (Table 1). The data were collected from unselected patients but even so, and foreseeably, there was an overrepresentation of results within the reference intervals and thus the distributions of the data were skewed.

All measurements were performed in the routine laboratory in Greifswald, which used three Dimension Vista 1500 instruments. Procedures and reagents were according to the manufacturer's instructions (Siemens Healthcare Diagnostics, Eschborn, Germany). The measurements were monitored according to the Rili-BAEK [11,12] IQC procedures and only results from clinically accepted measurements were included in the study. To be included in the presently used database both results had to comply with the quality rules. For further details, see Neubig et al. [8].

The clinical samples were randomly measured by three different instruments and all results were pooled and sorted to remove extremes, e.g. values close to the limit of quantitation (LoQ) and substantially above the upper reference limit (URL), since they would be outside the clinically relevant intervals. The remaining results were randomized, and two types of experiments performed.

1. The absolute and relative Dahlberg s and %CV were calculated from 1,000 pairs of randomized results. These were then partitioned as evenly as practical and possible into five partitions. The absolute and relative Dahlberg uncertainties were calculated for each partition and displayed in graphs. The MD (repeat) was derived. This was repeated ten times and the median and interquartile intervals (IQR) calculated.

2. For each quantity, the total bulk of randomized results was partitioned according to the groups defined in experiment 1). Thirty groups of 25 results, in all 750 pairs, were selected for each of the studied quantities and the Dahlberg uncertainties calculated for each group. The results were summarized as the median and IQR for each partition and quantity. The results were displayed in scattergrams.

By re-randomizing the datasets, partitions could be selected that utilized different data for experiments (1) and (2) and thus limiting confounding by overlap. This also allowed confirming experiments when unexpected results were obtained.
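The resampling design of experiment (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Excel implementation: the function names are hypothetical, the partitions are cut into equal counts after sorting by the pair mean (the study adjusted partition limits interactively), and the data are synthetic with a constant true standard deviation of 0.05.

```python
import math
import random
import statistics

def dahlberg_sd(pairs):
    """Dahlberg repeatability SD: sqrt(sum(d_i^2) / (2N))."""
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs)))

def partition_evenly(pairs, n_parts=5):
    """Sort pairs by their mean and cut into n_parts chunks of near-equal size
    (an assumption; the study set partition limits by hand)."""
    ordered = sorted(pairs, key=lambda p: (p[0] + p[1]) / 2)
    size, extra = divmod(len(ordered), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(ordered[start:end])
        start = end
    return parts

def repeatability_profile(all_pairs, n_draw=1000, n_repeats=10, n_parts=5, seed=0):
    """Draw n_draw pairs n_repeats times; return (median, Q1, Q3) of the
    Dahlberg SD for each partition."""
    rng = random.Random(seed)
    per_partition = [[] for _ in range(n_parts)]
    for _ in range(n_repeats):
        sample = rng.sample(all_pairs, n_draw)
        for i, part in enumerate(partition_evenly(sample, n_parts)):
            per_partition[i].append(dahlberg_sd(part))
    profile = []
    for values in per_partition:
        q1, _, q3 = statistics.quantiles(values, n=4)
        profile.append((statistics.median(values), q1, q3))
    return profile

# Synthetic duplicates (invented data): true SD is 0.05 at every concentration.
rng = random.Random(1)
synthetic = []
for _ in range(5000):
    true_value = rng.uniform(1.0, 10.0)
    synthetic.append((rng.gauss(true_value, 0.05), rng.gauss(true_value, 0.05)))

profile = repeatability_profile(synthetic)
for number, (med, q1, q3) in enumerate(profile, start=1):
    print(f"partition {number}: median s_D = {med:.4f} (IQR {q1:.4f} to {q3:.4f})")
```

With homoscedastic synthetic data the five medians should all lie close to 0.05; a real profile would be read from how the medians change across partitions.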

The ANOC analysis was applied to five partitions of the measurands of the entire dataset. The number of observations in each partition was in the order of 2,500 (P-TSH) to 23,000 (P-Creatinine) (Table 2) but was not further subdivided. The repeatability was directly calculated as the square root of the mean square obtained in the standard ANOVA table.
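The following sketch illustrates why the within-group mean square of a one-way ANOVA, with each duplicate pair treated as its own group, coincides with the Dahlberg variance. The function names and the four example pairs are invented for illustration; they are not taken from the study data.

```python
import math

def within_group_mean_square(groups):
    """MS_within = sum of squared deviations from each group mean,
    divided by (total observations - number of groups)."""
    ss_within = 0.0
    n_total = 0
    for group in groups:
        mean = sum(group) / len(group)
        ss_within += sum((x - mean) ** 2 for x in group)
        n_total += len(group)
    return ss_within / (n_total - len(groups))

def dahlberg_variance(pairs):
    """Dahlberg variance: sum(d_i^2) / (2N) for N duplicate pairs."""
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))

# Invented duplicate results, each pair forming one ANOVA group.
pairs = [(2.10, 2.14), (2.31, 2.29), (2.45, 2.50), (2.20, 2.20)]

ms_w = within_group_mean_square(pairs)
s2_d = dahlberg_variance(pairs)
print(math.sqrt(ms_w), math.sqrt(s2_d))  # the two repeatability SDs agree
```

For a pair, the squared deviations from the pair mean sum to d²/2 and each pair contributes one degree of freedom, so the within-group mean square reduces to Σd²/(2N). Unlike the Dahlberg formula, `within_group_mean_square` also accepts groups with more than two observations.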

Table 1. Number of duplicate observations (rounded down to the nearest hundred) within the interval indicated in parentheses.

| Measurand | Number |
|---|---|
| Alanine amino transferase (ALT) (0.1–16.7 µkat/L) | 16,000 |
| Aspartate amino transferase (AST) (0.05–16.7 µkat/L) | 21,700 |
| Calcium (1.25–3.75 mmol/L) | 64,600 |
| Cholesterol (1.29–15.5 mmol/L) | 12,000 |
| Creatinine (9–1,700 µmol/L) | 75,000 |
| CRP (3.1–60 mg/L) | 21,600 |
| Triglycerides (0.02–11.3 mmol/L) | 18,300 |
| Thyroid stimulating hormone (TSH) (0.05–100 mU/L) | 23,900 |

Table 2. Number of observations in each of the partitions analyzed by the ANOC. Concentration intervals as in Figure 3.

| Partition | ALT | AST | Calcium | Cholesterol | Creatinine | CRP | TG | TSH |
|---|---|---|---|---|---|---|---|---|
| 1 | 12,409 | 1,793 | 17,320 | 3,093 | 2,491 | 2,603 | 17,149 | 5,306 |
| 2 | 5,299 | 9,637 | 16,830 | 3,093 | 11,008 | 4,969 | 5,761 | 2,243 |
| 3 | 6,173 | 10,179 | 17,957 | 6,969 | 23,462 | 3,700 | 6,111 | 4,968 |
| 4 | 3,541 | 17,559 | 11,854 | 6,699 | 21,237 | 2,558 | 5,009 | 5,249 |
| 5 | 3,999 | 3,685 | 1,244 | 2,807 | 16,933 | 10,098 | 2,225 | 5,983 |


Statistical procedures

The repeatability component of the measurement uncertainty was calculated by the Dahlberg procedure [13] using the formula

$$s_D^2 = \frac{\sum_{i=1}^{N} d_i^2}{2N}$$

where d is the difference between paired results and N is the number of pairs [6].

The relative standard deviation is based on the relative difference of each pair, i.e. the difference in relation to the average of the observations.

$$\%CV_D = 100 \sqrt{\frac{\sum_{i=1}^{N} \left( \frac{x_{i1} - x_{i2}}{(x_{i1} + x_{i2})/2} \right)^2}{2N}}$$

The %CV_D calculated expresses the weighted repeatability coefficient of variation, relative to the average of the concentrations in the measuring interval.
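As a minimal sketch of the two Dahlberg expressions, assuming each duplicate is available as a tuple of two results, the absolute and relative repeatability could be computed as follows. The function names and the four example pairs are hypothetical and serve only to show the arithmetic.

```python
import math

def dahlberg_sd(pairs):
    """Absolute repeatability SD: sqrt(sum(d_i^2) / (2N))."""
    n = len(pairs)
    return math.sqrt(sum((x1 - x2) ** 2 for x1, x2 in pairs) / (2 * n))

def dahlberg_cv_percent(pairs):
    """Relative repeatability (%CV): each difference is divided by the
    mean of its own pair before squaring and summing."""
    n = len(pairs)
    rel_sq = sum(((x1 - x2) / ((x1 + x2) / 2)) ** 2 for x1, x2 in pairs)
    return 100 * math.sqrt(rel_sq / (2 * n))

# Invented duplicate results (not study data).
pairs = [(2.10, 2.14), (2.31, 2.29), (2.45, 2.50), (2.20, 2.20)]

s_d = dahlberg_sd(pairs)
cv_d = dahlberg_cv_percent(pairs)
print(f"s_D = {s_d:.4f}, %CV_D = {cv_d:.2f}")
```

Because every pair is scaled by its own mean, `dahlberg_cv_percent` is not simply `dahlberg_sd` divided by the overall mean; the two coincide only when all pairs have the same concentration.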

The minimal difference, MD, between measurements of two patient samples can be calculated at each partition as

$$MD = s_D \cdot k \cdot \sqrt{2}$$

where k, the coverage factor, was set to 2, conventionally representing a level of confidence of 95%, and s_D is the best estimate of the repeatability.
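A short sketch of the MD expression as stated above, with the coverage factor defaulting to 2. The function name and the input value of 0.10 are hypothetical and chosen only to show the arithmetic.

```python
import math

def minimal_difference(s_d, k=2.0):
    """MD = s_D * k * sqrt(2), with k the coverage factor."""
    return s_d * k * math.sqrt(2)

# Hypothetical repeatability SD of 0.10 (same unit as the measurand).
md = minimal_difference(0.10)
print(f"MD = {md:.3f}")
```

The factor √2 reflects that two results, each carrying the repeatability uncertainty s_D, are being compared.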

An Excel spreadsheet program was developed to accommodate up to 1,000 pairs of duplicates and used for experiment (1). This spreadsheet was designed to provide the results presented in Tables 3 and 4 and the associated graphs (Figures 1 and 2). It allowed interactive adjustments of the partitions. A separate spreadsheet was used to calculate and summarize the median and IQR of the s_D and %CV_D of the 30 groups, presented in Table 4 and Figure 2, respectively.

The ANOC is based on a standard ANOVA table. The mean square (MS) of the within group equals the within-group variance and can be shown to be equal to the Dahlberg expression. The relative standard deviation (%CV) was calculated using the average of the corresponding partition.

Results

The Dahlberg standard deviation and coefficients of variation were displayed together in scatter diagrams with either the partition number or concentration as the independent variable (Figures 1 and 2). The s was presented on the primary y-axis and the %CV_D on the secondary y-axis. The partition number was used in presenting results of experiment 1 whereas the concentration was used in experiment 2. The concentrations used in the graphs of Figure 2 are the middle of the partition interval (average of the upper and lower limit). The plotted value of the highest concentration was reduced to improve readability in the graphs of ALT, AST, Creatinine and TSH in Figure 2. Therefore, the slope of the connecting line in the last partition is not correctly shown. This approximation was not necessary when the partition number was used since the partitions were represented by equal divisions of the axis.

Table 3. The medians and 25th and 75th percentiles of results calculated from 10 repeated selections of 1,000 unique results. The minimal differences are based on the repeatability standard deviation in each partition.

| Measurand | Partition | Conc | Freq | SD Median | SD 25-P | SD 75-P | %CV Median | %CV 25-P | %CV 75-P | MD Median | MD 25-P | MD 75-P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ALT | 1 | 0.3 | 36.6 | 0.012 | 0.012 | 0.013 | 6.1 | 5.9 | 6.5 | 0.02 | 0.0 | 0.03 |
| ALT | 2 | 0.4 | 17.3 | 0.013 | 0.012 | 0.014 | 3.9 | 3.5 | 4.2 | 0.03 | 0.0 | 0.03 |
| ALT | 3 | 0.6 | 19.1 | 0.013 | 0.013 | 0.014 | 2.7 | 2.6 | 3.0 | 0.03 | 0.0 | 0.03 |
| ALT | 4 | 0.9 | 10.9 | 0.014 | 0.013 | 0.015 | 2.0 | 1.8 | 2.1 | 0.03 | 0.0 | 0.03 |
| ALT | 5 | 15 | 16.1 | 0.029 | 0.026 | 0.036 | 1.2 | 1.2 | 1.3 | 0.06 | 0.1 | 0.07 |
| AST | 1 | 0.2 | 14.4 | 0.015 | 0.013 | 0.016 | 10.1 | 9.3 | 11.4 | 0.03 | 0.0 | 0.03 |
| AST | 2 | 0.3 | 25.0 | 0.014 | 0.014 | 0.015 | 5.9 | 5.8 | 6.1 | 0.03 | 0.0 | 0.03 |
| AST | 3 | 0.4 | 19.3 | 0.015 | 0.014 | 0.016 | 4.3 | 4.1 | 4.7 | 0.03 | 0.0 | 0.03 |
| AST | 4 | 0.7 | 22.5 | 0.017 | 0.015 | 0.018 | 3.5 | 3.2 | 3.9 | 0.03 | 0.0 | 0.04 |
| AST | 5 | 20 | 18.8 | 0.025 | 0.021 | 0.033 | 1.7 | 1.5 | 2.1 | 0.05 | 0.0 | 0.07 |
| Calcium | 1 | 2.1 | 26.3 | 0.023 | 0.022 | 0.023 | 1.2 | 1.1 | 1.2 | 0.05 | 0.0 | 0.05 |
| Calcium | 2 | 2.15 | 11.5 | 0.022 | 0.021 | 0.022 | 1.0 | 1.0 | 1.0 | 0.04 | 0.0 | 0.04 |
| Calcium | 3 | 2.25 | 29.6 | 0.022 | 0.022 | 0.022 | 1.0 | 1.0 | 1.0 | 0.04 | 0.0 | 0.04 |
| Calcium | 4 | 2.45 | 29.8 | 0.024 | 0.022 | 0.024 | 1.0 | 1.0 | 1.0 | 0.05 | 0.0 | 0.05 |
| Calcium | 5 | 3.2 | 2.9 | 0.028 | 0.028 | 0.030 | 1.1 | 1.1 | 1.2 | 0.06 | 0.1 | 0.06 |
| Cholesterol | 1 | 3 | 12.9 | 0.06 | 0.05 | 0.06 | 2.2 | 2.2 | 2.3 | 0.1 | 0.1 | 0.1 |
| Cholesterol | 2 | 4.5 | 28.2 | 0.08 | 0.07 | 0.08 | 1.9 | 1.8 | 2.0 | 0.2 | 0.1 | 0.2 |
| Cholesterol | 3 | 5.5 | 25.6 | 0.08 | 0.08 | 0.09 | 1.7 | 1.6 | 1.8 | 0.2 | 0.2 | 0.2 |
| Cholesterol | 4 | 6.9 | 25.6 | 0.10 | 0.10 | 0.11 | 1.6 | 1.6 | 1.7 | 0.2 | 0.2 | 0.2 |
| Cholesterol | 5 | 10 | 7.7 | 0.13 | 0.12 | 0.13 | 1.7 | 1.7 | 1.8 | 0.3 | 0.2 | 0.3 |
| Creatinine | 1 | 50 | 7.7 | 2.5 | 2.3 | 2.8 | 6.4 | 6.0 | 7.0 | 5.0 | 4.6 | 5.6 |
| Creatinine | 2 | 70 | 25.6 | 2.6 | 2.6 | 2.8 | 4.4 | 4.2 | 4.4 | 5.2 | 5.2 | 5.5 |
| Creatinine | 3 | 85 | 22.9 | 2.8 | 2.7 | 2.9 | 3.6 | 3.5 | 3.7 | 5.5 | 5.4 | 5.8 |
| Creatinine | 4 | 110 | 21.8 | 2.9 | 2.8 | 3.0 | 3.0 | 3.0 | 3.1 | 5.8 | 5.7 | 5.9 |
| Creatinine | 5 | 500 | 22.0 | 3.2 | 3.1 | 3.2 | 2.0 | 1.9 | 2.1 | 6.3 | 6.2 | 6.4 |
| CRP | 1 | 7 | 20.4 | 0.5 | 0.3 | 0.5 | 7.3 | 5.9 | 7.4 | 0.9 | 0.7 | 1.1 |
| CRP | 2 | 12 | 20.1 | 0.8 | 0.8 | 0.9 | 7.9 | 7.6 | 8.4 | 1.6 | 1.5 | 1.8 |
| CRP | 3 | 20 | 21.7 | 1.0 | 0.9 | 1.1 | 6.9 | 6.3 | 7.6 | 1.9 | 1.8 | 2.1 |
| CRP | 4 | 35 | 19.3 | 0.9 | 0.8 | 1.1 | 3.3 | 2.8 | 3.8 | 1.7 | 1.5 | 2.1 |
| CRP | 5 | 75 | 18.5 | 1.3 | 1.3 | 1.5 | 2.8 | 2.6 | 3.3 | 2.6 | 2.5 | 3.0 |
| Triglycerides | 1 | 1.1 | 19.6 | 0.11 | 0.07 | 0.16 | 6.2 | 4.8 | 7.3 | 0.21 | 0.1 | 0.32 |
| Triglycerides | 2 | 1.4 | 20.4 | 0.06 | 0.05 | 0.08 | 3.8 | 3.2 | 4.9 | 0.12 | 0.1 | 0.16 |
| Triglycerides | 3 | 1.8 | 21.5 | 0.04 | 0.04 | 0.05 | 2.6 | 2.5 | 3.3 | 0.09 | 0.1 | 0.11 |
| Triglycerides | 4 | 2.6 | 19.9 | 0.11 | 0.09 | 0.23 | 6.0 | 4.0 | 7.3 | 0.22 | 0.2 | 0.45 |
| Triglycerides | 5 | 10 | 18.6 | 0.18 | 0.11 | 0.20 | 4.4 | 1.9 | 5.7 | 0.36 | 0.2 | 0.41 |
| TSH | 1 | 0.7 | 20.5 | 0.01 | 0.01 | 0.01 | 1.8 | 1.7 | 1.9 | 0.02 | 0.0 | 0.02 |
| TSH | 2 | 1.1 | 19.4 | 0.02 | 0.01 | 0.02 | 1.7 | 1.6 | 1.7 | 0.03 | 0.0 | 0.03 |
| TSH | 3 | 1.55 | 20.6 | 0.02 | 0.02 | 0.02 | 1.6 | 1.6 | 1.7 | 0.04 | 0.0 | 0.04 |
| TSH | 4 | 2.75 | 21.7 | 0.03 | 0.03 | 0.03 | 1.6 | 1.5 | 1.7 | 0.07 | 0.1 | 0.07 |
| TSH | 5 | 20 | 17.9 | 0.11 | 0.11 | 0.12 | 1.6 | 1.6 | 1.6 | 0.23 | 0.2 | 0.23 |

The enzymes AST and ALT showed practically identical uncertainty profiles in both experiments and similar patterns were found for Creatinine, Triglycerides and TSH. A similar tendency was obtained for Cholesterol, whereas the profile of CRP was irregular with a peak covering the upper reference limit. The Calcium profile also showed a decrease in both s_D and %CV_D from the very low concentrations, followed by small increases. The imprecision profiles based on the ANOC analysis (Figure 3) of the entire dataset were generally the same as those obtained by the experiments.

Discussion

Measurement uncertainty/imprecision profiles in clinical chemistry were highlighted in the early days of competitive immunoassays [1,14,15]. The four-parameter logistic (sigmoid) calibration functions used in those procedures described the data particularly well and are widely used to limit and devise the measuring interval. Imprecision profiles have been much less, or hardly at all, discussed for substrate or endpoint reactions where the calibration functions commonly are linear.

The profession has spent considerable efforts searching for optimal methods to calculate ‘analytical goals’. Dedicated conferences in Stockholm 1999 [16] and in Milan 2014 [17,18] have been launched for the purpose. Subsequent to these conferences, procedures for establishing analytical goals were agreed. High on the hierarchical scale of recommendations were those based on measurement uncertainty and the biological variation. Less attention has been paid to how the recommended analytical goals should be assessed in the laboratory. However, quality manuals, e.g. the international accreditation standard (ISO 15189) and the German laboratory quality system Rili-BAEK [11,12], require that control materials of two different concentrations are used to monitor the measurement precision and trueness. Commonly, the imprecision, e.g. the average relative standard deviation (%CV), calculated over time, of the two stabilized control materials is presented as the uncertainty of the measurement procedure whether the uncertainty profile is known or not. This is thus based on the reproducibility or intra-laboratory imprecision.

If the relative standard deviation is constant throughout the measurement interval, or if the measurement uncertainty at the concentration which best discriminates between health and disease is used, the current practice is appropriate. However, for some measurands the absolute standard deviation is constant in a wide concentration interval and is likely to illustrate the performance of a measurement more appropriately. For instance, in the present report the uncertainty profile for AST shows a fourteen-fold decrease of the %CV whereas the increase of s was only two-fold (Figures 1 and 2). On the other hand, cholesterol showed a 2.5-fold increase of the s and a 1.5-fold decrease of the %CV within the measuring interval (Figures 1 and 2). It was not generally possible to classify a method’s uncertainty profile as either homo- or heteroscedastic, although specific intervals could be identified in which the standard deviation (Creatinine) or the coefficient of variation (TSH) or both (Calcium) were almost constant, whereas the profiles for ALT, AST, Cholesterol and CRP could not be sensibly fitted to either.
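The fold changes quoted for AST and cholesterol can be approximated from the partition medians in Table 4. The sketch below takes the ratio of the largest to the smallest median in each profile; the values are transcribed from Table 4, while the helper name `fold_change` and the choice of a max/min ratio are assumptions made for illustration, so the results need not match the rounded figures in the text exactly.

```python
def fold_change(values):
    """Ratio of the largest to the smallest value in a profile."""
    return max(values) / min(values)

# Partition medians transcribed from Table 4.
ast_cv = [12.57, 6.50, 4.49, 2.89, 0.90]       # AST %CV
ast_sd = [0.015, 0.013, 0.013, 0.015, 0.025]   # AST SD
chol_sd = [0.055, 0.068, 0.077, 0.102, 0.127]  # Cholesterol SD
chol_cv = [2.21, 1.74, 1.56, 1.62, 1.74]       # Cholesterol %CV

for label, values in [("AST %CV", ast_cv), ("AST SD", ast_sd),
                      ("Cholesterol SD", chol_sd), ("Cholesterol %CV", chol_cv)]:
    print(f"{label}: {fold_change(values):.1f}-fold")
```

The AST %CV ratio comes out close to fourteen and the AST SD ratio close to two, while the cholesterol ratios are of the same order as, but somewhat below, the 2.5-fold and 1.5-fold figures given above.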

The access to the very large database of duplicate measurements allowed utilizing the entire material for analysis of variance components. In those experiments each sample, consisting of two measurements, was regarded as a group in the ANOVA setup. The material was partitioned as in the general experiment of the present study. The mean square of the ‘within group’ is equivalent to the Dahlberg variance [7] and liable to the same limitations. The general patterns of the imprecision profiles were comparable with those obtained in the experimental setup with randomly selected results. However, this approach will not give the intra-laboratory imprecision since the ‘between group’ mean square will essentially represent the variance of the distribution of the sample results. It therefore has no place in elucidation of the imprecision profile and only the square root of the within-group mean square is reported.

Table 4. The medians and interquartile intervals were based on 30 groups of 25 pairs (750 pairs in all) of observations after randomization of the initial dataset.

| Measurand | Conc | SD Median | SD 25-perc | SD 75-perc | %CV Median | %CV 25-perc | %CV 75-perc |
|---|---|---|---|---|---|---|---|
| ALT | 0.2 | 0.011 | 0.010 | 0.013 | 5.90 | 5.23 | 6.31 |
| ALT | 0.35 | 0.012 | 0.011 | 0.013 | 4.60 | 4.41 | 5.35 |
| ALT | 0.5 | 0.012 | 0.011 | 0.013 | 3.26 | 2.80 | 3.42 |
| ALT | 0.75 | 0.013 | 0.012 | 0.015 | 2.17 | 1.90 | 2.51 |
| ALT | 1.8 | 0.021 | 0.018 | 0.026 | 1.29 | 1.21 | 1.39 |
| AST | 0.07 | 0.015 | 0.013 | 0.018 | 12.57 | 11.77 | 15.12 |
| AST | 0.2 | 0.013 | 0.013 | 0.016 | 6.50 | 5.90 | 7.79 |
| AST | 0.3 | 0.013 | 0.012 | 0.016 | 4.49 | 4.01 | 5.31 |
| AST | 0.75 | 0.015 | 0.013 | 0.019 | 2.89 | 2.45 | 3.75 |
| AST | 1.8 | 0.025 | 0.019 | 0.034 | 0.90 | 0.86 | 1.03 |
| Calcium | 2.1 | 0.023 | 0.021 | 0.025 | 1.14 | 1.03 | 1.28 |
| Calcium | 2.15 | 0.023 | 0.020 | 0.024 | 1.06 | 0.93 | 1.13 |
| Calcium | 2.25 | 0.023 | 0.021 | 0.025 | 0.97 | 1.06 | 1.14 |
| Calcium | 2.45 | 0.026 | 0.024 | 0.030 | 1.06 | 0.99 | 1.23 |
| Calcium | 2.8 | 0.030 | 0.028 | 0.031 | 1.14 | 1.03 | 1.20 |
| Cholesterol | 3.0 | 0.055 | 0.051 | 0.061 | 2.21 | 1.97 | 2.37 |
| Cholesterol | 3.7 | 0.068 | 0.061 | 0.074 | 1.74 | 1.55 | 1.92 |
| Cholesterol | 5.0 | 0.077 | 0.065 | 0.088 | 1.56 | 1.32 | 1.76 |
| Cholesterol | 6.3 | 0.102 | 0.081 | 0.115 | 1.62 | 1.35 | 1.89 |
| Cholesterol | 6.9 | 0.127 | 0.115 | 0.145 | 1.74 | 1.54 | 1.96 |
| Creatinine | 30 | 2.315 | 2.112 | 2.889 | 7.14 | 6.44 | 8.20 |
| Creatinine | 45 | 2.674 | 2.287 | 2.990 | 5.36 | 4.65 | 5.94 |
| Creatinine | 70 | 2.585 | 2.260 | 2.918 | 3.73 | 3.34 | 4.16 |
| Creatinine | 95 | 2.668 | 2.437 | 2.943 | 2.90 | 2.62 | 3.24 |
| Creatinine | 110 | 2.977 | 2.628 | 3.565 | 1.86 | 1.73 | 2.08 |
| CRP | 5.0 | 0.242 | 0.195 | 0.290 | 5.59 | 4.72 | 6.69 |
| CRP | 7.5 | 0.405 | 0.333 | 0.552 | 5.27 | 4.51 | 6.80 |
| CRP | 12.5 | 0.827 | 0.607 | 0.964 | 6.83 | 4.84 | 8.41 |
| CRP | 17.5 | 0.856 | 0.660 | 1.034 | 5.39 | 3.94 | 6.63 |
| CRP | 20.0 | 1.035 | 0.788 | 1.393 | 2.52 | 2.15 | 3.48 |
| Triglycerides | 1.5 | 0.041 | 0.029 | 0.068 | 3.81 | 2.72 | 4.75 |
| Triglycerides | 1.8 | 0.038 | 0.028 | 0.052 | 2.18 | 1.65 | 2.99 |
| Triglycerides | 2.5 | 0.042 | 0.038 | 0.076 | 1.71 | 1.57 | 3.69 |
| Triglycerides | 3.5 | 0.048 | 0.041 | 0.076 | 1.42 | 1.23 | 2.37 |
| Triglycerides | 4.0 | 0.066 | 0.061 | 0.090 | 1.29 | 1.18 | 1.61 |
| TSH | 0.7 | 0.008 | 0.007 | 0.009 | 1.83 | 1.62 | 1.95 |
| TSH | 0.9 | 0.014 | 0.012 | 0.016 | 1.62 | 1.39 | 1.72 |
| TSH | 1.3 | 0.021 | 0.018 | 0.023 | 1.57 | 1.42 | 1.78 |
| TSH | 2.1 | 0.030 | 0.027 | 0.034 | 1.53 | 1.38 | 1.65 |
| TSH | 4.0 | 0.107 | 0.077 | 0.130 | 1.54 | 1.33 | 1.79 |

The commutability of the materials used in estimating measurement uncertainty is crucial. Hage-Sleiman et al. [19] showed that for troponin I, a systematically higher %CV (13%) was obtained when stabilized control materials were used for estimating measurement uncertainty than when natural patient samples were used (4%). Similar effects have been cautioned by Sadler [15,20]. The results of the present study, which by definition uses fully commutable samples, i.e. patient samples, are therefore likely to illustrate the general pattern of the performance of the measurement methods used. The transferability of the absolute and relative uncertainty can only be judged by complying with requirements of the internal quality system. Although the study period was about one year, only values which were released to the clinic and thus fulfilling the quality requirements were used.

Figure 1. Repeatability profiles based on 10 sets of 1,000 duplicates (panels: ALT, AST, Calcium, Cholesterol, CRP, Creatinine, Triglycerides, TSH). Absolute uncertainty (SD) is shown on the left Y-axis, relative uncertainty (%CV) on the right axis. The X-axis shows the partitions (1–5); the corresponding concentration intervals are reported in Table 2. Error bars represent the IQR.

Raggatt [21] used duplicates for estimating both ‘catastrophic errors’ and detecting outliers in addition to calculating precision profiles in immunoassays. Subsequent studies have had a primary focus on precision/imprecision profiles in the context of measurement uncertainty [2,20–27].

The Dahlberg formula can be derived and understood in many ways. In a previous report [4] the origin of the Dahlberg variance and its formal and practical relation to repeatability variance was discussed. The degrees of freedom in the Dahlberg calculation is the same as the number of duplicate pairs, which is equal to the degrees of freedom of the within-group variance in the ANOC approach.

The experimental setting for an analysis according to Dahlberg does not consider the between-series variance and can therefore not be used for calculating the corresponding MD or RCV. However, it is important to recognize that these diagnostic aids may take different values if based on the repeatability or the intra-laboratory variation. The repeatability imprecision is generally smaller than the intra-laboratory imprecision and therefore signals a higher analytical sensitivity than the reproducibility. The differentiation of the imprecision profiles is important in inferring laboratory data in a rapidly progressing disease, e.g. in intensive care, or in chronic diseases or screening situations [6].

Figure 2. Repeatability profiles based on 30 sets of 25 duplicates (panels: ALT, AST, Calcium, Cholesterol, Creatinine, CRP, Triglycerides, TSH). Absolute uncertainty (SD) is shown on the left Y-axis, relative uncertainty (%CV) on the right axis. The X-axis shows the average concentration of the partitions. The concentration of the fifth partition has been selectively reduced to make the diagrams more readable. Error bars represent the IQR.

We have used data from a large cohort to demonstrate the effects. A reliable estimate of the repeatability uncertainty using the Dahlberg approach can, however, be obtained using about 25 pairs of duplicates [7].

Conclusions

Duplicate measurements can be used in the Dahlberg or ANOC procedures to calculate the repeatability and the repeatability profile. To calculate both the repeatability and reproducibility and their profiles, an experimental design must provide measurements under both conditions, i.e. several runs with repeated measurements using the same sample. If ANOC analysis is applied to a series of IQC results, additional information can be obtained, e.g. about differences between reagent lots, measurement systems or laboratories. Taken together, our results indicate that repeatability imprecision profiles calculated from duplicate measurements of natural patient samples within the intervals of concentration encountered in a laboratory may provide more nuanced repeatability imprecision profiles than those provided by a few observations using stabilized control materials. It is emphasized that the repeatability obtained by the procedures presented in the present report does not represent the method imprecision except when used under similar repeatability conditions, i.e. repeated measurement within a short time, e.g. in intensive care.

[Figure 3. Panels: ALT, AST, Calcium, Creatinine, Cholesterol, Triglycerides, CRP, TSH. X-axis: partition; left Y-axis: SD; right Y-axis: %CV.]

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

This study was supported by the Karolinska university laboratory and the County council of Östergötland.

References

[1] Ekins RP. The precision profile: its use in assay design, assessment and quality control. In: Hunter WM, Corrie JT, editors. Immunoassays for clinical chemistry. London: Churchill Livingstone; 1983. p. 111–122.

[2] Sadler WA, Smith MH. Use and abuse of imprecision profiles – some pitfalls illustrated by computing and plotting confidence-intervals. Clin Chem. 1990;36(7):1346–1350.

[3] Hyslop NP, White WH. Estimating precision using duplicate measurements. J Air Waste Manage Assoc. 2009;59:1032–1039.

[4] JCGM. Evaluation of measurement data – guide to the expression of uncertainty in measurement. JCGM. 2008;100. GUM 1995 with minor corrections. [cited 2019 Dec 27]. http://www.bipm.org/utils/common/documents/jcgm/JCGM_100_2008_E.pdf

[5] Oosterhuis WP, Bayat H, Armbruster D, et al. The use of error and uncertainty methods in the medical laboratory. Clin Chem Lab Med. 2018;56(2):209–219.

[6] Kallner A, Theodorsson E. Repeatability imprecision from analysis of duplicates of patient samples. Scand J Clin Lab Invest Accepted for Publ. 2019.

[7] Kallner A, Theodorsson E. An experimental study of methods for the analysis of variance components in the inference of laboratory information. Scand J Clin Lab Invest Accepted for Publ. 2019.

[8] Neubig S, Grotevendt A, Kallner A, et al. Analytical robustness of nine common assays: frequency of outliers and extreme differences identified by a large number of duplicate measurements. Biochem Med. 2017;27:192–198.

[9] Aronsson T, Groth T. Nested control procedures for internal analytical quality control. Theoretical design and practical evaluation. Scand J Clin Lab Invest Suppl. 1984;172:51–64.

[10] ISO/DIS 5725. Accuracy (trueness and precision) of measurement methods and results - Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method. ISO Geneva. 1994.

[11] Bundesärztekammer. Neufassung der “Richtlinie der Bundesärztekammer zur Qualitätssicherung laboratoriumsmedizinischer Untersuchungen – Rili-BÄK”. Bundesärztekammer. [cited 2019 Dec 27]. https://www.bundesaerztekammer.de/fileadmin/user_upload/downloads/pdf-Ordner/RL/Rili-BAEK-Laboratoriumsmedizin.pdf

[12] Revision of the “Guideline of the German Medical Association on Quality Assurance in Medical Laboratory Examinations – Rili-BAEK” (unauthorized translation). Laboratoriumsmedizin. 2015;39:26–69.

[13] Dahlberg G. Statistical methods for medical and biological students. London: G. Allen & Unwin Ltd.; 1940.

[14] Rodbard D. Statistical quality control and routine data processing for radioimmunoassays and immunoradiometric assays. Clin Chem. 1974;20(10):1255–1270.

[15] Sadler WA. Imprecision profiling. Clin Biochem Rev. 2008;29(Suppl 1):S33–S36.

[16] Fraser CG, Kallner A, Kenny D, et al. Introduction: strategies to set global quality specifications in laboratory medicine. Scand J Clin Lab Invest. 1999;59(7):477–478.

[17] Panteghini M, Sandberg S. Defining analytical performance specifications 15 years after the Stockholm conference. Clin Chem Lab Med. 2015;53(6):829–832.

[18] Sandberg S, Fraser CG, Horvath AR, et al. Defining analytical performance specifications: consensus Statement from the 1st Strategic Conference of the European Federation of Clinical Chemistry and Laboratory Medicine. Clin Chem Lab Med. 2015;53(6):833–835.

[19] Hage-Sleiman M, Capdevila L, Bailleul S, et al. High-sensitivity cardiac troponin-I analytical imprecisions evaluated by internal quality control or imprecision profile. Clin Chem Lab Med. 2019;57(4):E49–E51.

[20] Sadler WA, Smith MH, Legge HM. A method for direct estimation of imprecision profiles, with reference to immunoassay data. Clin Chem. 1988;34(6):1058–1061.

[21] Raggatt PR. Duplicates or singletons – an analysis of the need for replication in immunoassay and a computer-program to calculate the distribution of outliers, error rate and the precision profile from assay duplicates. Ann Clin Biochem. 1989;26(1):26–37.

[22] Berweger CD, Muller-Plathe F, Hanseler E, et al. Estimating imprecision profiles in biochemical analysis. Clin Chim Acta. 1998;277(2):107–125.

[23] Gonzalez AG, Herrador MA. Accuracy profiles from uncertainty measurements. Talanta. 2006;70:896–901.

[24] Gonzalez AG, Herrador MA. A practical guide to analytical method validation, including measurement uncertainty and accuracy profiles. TRAC-Trends Anal Chem. 2007;26:227–228.

[25] Kenward MG. A method for comparing profiles of repeated measurements. Appl Statist. 1987;36(3):296–308.

[26] Lee KY, Yanagisawa Y, Spengler JD, et al. Assessment of precision of a passive sampler by duplicate measurements. Environ Int. 1995;21(4):407–412.

[27] Macarthur R, Feinberg M, Bertheau Y. Construction of measurement uncertainty profiles for quantitative analysis of genetically modified organisms based on interlaboratory validation data. J AOAC Int. 2010;93(3):1046–1056.