© L R Pendrill 180207 BEMC Berlin
How to calibrate a questionnaire: quality-assuring
categorical data with psychometric measurement theory
L R Pendrill
Research Institutes of Sweden, Metrology,
Eklandagatan 86, 41261 Göteborg (SE),
phone:+46 767 88 54 44, mailto:leslie.pendrill@ri.se
Man as Measurement Instrument: Counting
’M dots’
http://itre.cis.upenn.edu/~myl/CommuniqueMundurucuENG.pdf
Measuring Man:
- Status, function of person
- Test against specifications
Man as Measurement Instrument:
- Perception of product/service function, comfort etc
- Propose improvements in product
Man as Measurement Instrument
Quality-assured categorical
measurement
Tukey [Chapter 8, Data analysis and behavioural science; quoted in “The collected works of John A Tukey, Volume III, Philosophy and principles of data analysis: 1949 – 1964”, ed. L V Jones, Univ. North Carolina, Chapel Hill]
Counted fractions
”Beware of attempts to interpret correlations between ratios
whose numerators and denominators contain common parts”
[Pearson 1897]
$\%X_j = \dfrac{X_j}{\sum_{k=1}^{K} X_k}$
Relative-number problems:
• Counting (sheep & goats)
• How many affected at this dose
• How many of the pebbles are quartz…
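As a minimal sketch of the relative-number problems listed above, a counted fraction divides each category count by the total over all K categories. The pebble counts and the function name `counted_fractions` are hypothetical, and expressing the result in percent is an assumption.

```python
def counted_fractions(counts):
    """Return each count as a percentage of the total count."""
    counts = list(counts)
    total = sum(counts)
    if total <= 0:
        raise ValueError("total count must be positive")
    return [100.0 * x / total for x in counts]


# Hypothetical tally: how many of the pebbles are quartz?
pebbles = {"quartz": 12, "feldspar": 5, "other": 3}
percent = dict(zip(pebbles, counted_fractions(pebbles.values())))
print(percent)
```

Because all fractions share the same denominator they are constrained to sum to 100 %, so they cannot vary independently of one another.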
Different scales of measurement
http://www.123rf.com
’Counted fractions’
$\%X_j = \dfrac{X_j}{\sum_{k=1}^{K} X_k}$
Logistic ruling
$z = \log\left(\dfrac{P_{success}}{1 - P_{success}}\right)$
’Counted fractions’
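The logistic ruling can be sketched as a pair of functions: the log-odds transform z and its inverse. This is an illustrative sketch only; the function names are hypothetical and the natural logarithm is assumed.

```python
import math


def logit(p):
    """Log-odds z = log(p / (1 - p)) of a success probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Map a log-odds value z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Counted fractions are bounded by 0 and 1; their log-odds are unbounded.
for p in (0.02, 0.5, 0.98):
    print(p, round(logit(p), 3))
```

The transform is symmetric about P = 0.5, where z = 0, and stretches the scale increasingly as P approaches 0 or 1.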
Probability
$q_c = \sum_{k=1}^{K} P_{c,k}\, p_k$
$P_{c,k}$ = probability of classification c when true level is k
$p_k$ = a priori probability that true level is k
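A small sketch of how the probability of observing classification c follows from the conditional probabilities of classification and the a priori probabilities of the true levels. The two-level matrix, the prior and the function name are hypothetical values chosen for illustration.

```python
def classification_probabilities(P, prior):
    """q[c] = sum over k of P[c][k] * prior[k]."""
    K = len(prior)
    if any(len(row) != K for row in P):
        raise ValueError("each row of P needs one entry per true level")
    return [sum(row[k] * prior[k] for k in range(K)) for row in P]


# Hypothetical two-level example: P[c][k] is the probability of
# classification c when the true level is k (each column sums to 1).
P = [[0.8, 0.3],
     [0.2, 0.7]]
prior = [0.5, 0.5]
q = classification_probabilities(P, prior)
print([round(value, 3) for value in q])
```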
Counted fractions
$\sum_{i=1}^{n} q_i = 1$; $q_i \ge 0$
$\sum_{i=1}^{n} q_i\, y_i = E(y)$
$q_i = \dfrac{e^{b\, y_i}}{\sum_{c=1}^{C} e^{b\, y_c}}$
b = Lagrange multiplier
J M Linacre 2002, ”Optimizing Rating Scale Category Effectiveness”, Journal of Applied Measurement, 3:1 pp. 85-106
$y_1, y_2, \ldots, y_C$; $y_c \in \mathbb{R}$
Kaffe.11
W P Fisher Jr. 1999
δ (Difficulty)
$P_{success} = \dfrac{e^{\theta - \delta}}{1 + e^{\theta - \delta}}$
Wright B.D. (1997) Fundamental measurement for outcome evaluation. Physical
medicine and rehabilitation : State of the Art Reviews. 11(2) : 261-288
$P_{success} = \dfrac{e^{\theta - \delta}}{1 + e^{\theta - \delta}}$
         PT1  PT2  PT3  PT4  PT5  PT6  PT7  PT8  PT9 PT10 PT11 PT12 PT13 PT14 PT15  ∑PT
RMI-1      0    1    0    0    0    0    1    0    0    0    0    0    0    0    0    2
RMI-2      0    1    0    0    1    0    1    0    0    0    0    0    0    0    0    3
RMI-3      0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    1
RMI-4      0    1    0    0    1    0    1    0    1    0    1    1    0    0    0    6
RMI-5      0    1    0    0    1    0    1    1    1    1    1    1    0    0    0    8
RMI-6      0    1    0    0    1    0    1    0    0    0    1    0    0    0    0    4
RMI-7      0    1    0    0    1    1    1    1    1    1    1    1    0    0    0    9
RMI-8      0    1    0    0    1    0    1    0    1    0    1    0    0    0    0    5
RMI-9      0    1    1    0    1    1    1    1    1    1    1    1    0    0    0   10
RMI-10     1    1    1    1    1    1    1    1    1    1    1    1    0    0    0   12
RMI-11     0    1    1    1    1    1    1    1    1    1    1    1    0    0    0   11
RMI-12     1    1    1    1    1    1    1    1    1    1    1    1    0    1    0   13
RMI-13     0    1    0    0    1    0    1    0    1    1    1    1    0    0    0    7
RMI-14     1    1    1    1    1    1    1    1    1    1    1    1    1    1    0   14
RMI-15     1    1    1    1    1    1    1    1    1    1    1    1    1    1    1   15
∑RMI       4   14    6    5   13    7   15    8   11    9   12   10    2    3    1
How does this work?
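One way to see how the response matrix above works is to sort its rows and columns by their totals. The sketch below transcribes the 0/1 entries from the table; the variable names and the `is_guttman` helper are hypothetical, and the labelling of rows as RMI and columns as PT simply follows the table.

```python
# Rows RMI-1 ... RMI-15, columns PT1 ... PT15, transcribed from the table.
rows = [
    "010000100000000",  # RMI-1
    "010010100000000",  # RMI-2
    "000000100000000",  # RMI-3
    "010010101011000",  # RMI-4
    "010010111111000",  # RMI-5
    "010010100010000",  # RMI-6
    "010011111111000",  # RMI-7
    "010010101010000",  # RMI-8
    "011011111111000",  # RMI-9
    "111111111111000",  # RMI-10
    "011111111111000",  # RMI-11
    "111111111111010",  # RMI-12
    "010010101111000",  # RMI-13
    "111111111111110",  # RMI-14
    "111111111111111",  # RMI-15
]
matrix = [[int(ch) for ch in row] for row in rows]

row_sums = [sum(r) for r in matrix]        # the ∑PT column
col_sums = [sum(c) for c in zip(*matrix)]  # the ∑RMI row

# Sort rows by increasing total and columns by decreasing total.
row_order = sorted(range(len(matrix)), key=lambda i: row_sums[i])
col_order = sorted(range(len(col_sums)), key=lambda j: -col_sums[j])
ordered = [[matrix[i][j] for j in col_order] for i in row_order]


def is_guttman(m):
    """True if every row is a run of 1s followed only by 0s."""
    return all(row == sorted(row, reverse=True) for row in m)


for i, row in zip(row_order, ordered):
    print(f"RMI-{i + 1:<3}", "".join(map(str, row)))
print("Guttman pattern:", is_guttman(ordered))
```

For these data the sorted matrix is triangular: each row total fully determines which entries in that row are 1.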
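[Figure: item labels ”7. Achieve” and ”3. Length”; scale from ”Worse quality” to ”Better quality”; annotation ”underestimated”]
$\text{Average } CTT_i = \dfrac{\sum_{j=1}^{L} P_{success,i,j}}{L}$
$\ln\left(\dfrac{P_{success,i,j}}{1 - P_{success,i,j}}\right) = \theta_i - \delta_j$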
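A short sketch of how the two expressions above relate: the CTT average is the mean success probability over L items, while the logit is linear in the difference between ability and difficulty. The item difficulties and abilities below are hypothetical values chosen only for illustration.

```python
import math


def p_success(theta, delta):
    """Rasch success probability for ability theta and difficulty delta."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def average_ctt(theta, deltas):
    """Mean success probability over the L items (expected raw score / L)."""
    return sum(p_success(theta, d) for d in deltas) / len(deltas)


deltas = [-2.0, -1.0, 0.0, 1.0, 2.0]     # hypothetical item difficulties
thetas = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # equally spaced abilities (logits)

scores = [average_ctt(t, deltas) for t in thetas]
steps = [b - a for a, b in zip(scores, scores[1:])]
for t, s in zip(thetas, scores):
    print(f"theta = {t:.1f}  average CTT = {s:.3f}")
```

Equal steps in ability give shrinking steps in the average score towards the end of the scale, so equal differences between raw scores do not correspond to equal differences in ability.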
Measurement and ordinal data
Required - epoch (/Myear)
Measured - depth (/cm)
https://cdn-assets.answersingenesis.org/img/articles/ee/v2/geological-layers.jpg
My health?
Quality-assured measurement
Cognitive ability?
0,8 units ± 0,2 units
Object: Health
Person-centred care (PCC)
• Focus on health (not illness)
• People partners in care
• More symptoms
• Impact on Activities of Daily Living
• Subjective & perceptive
• …
OECD Health Statistics 2015, http://dx.doi.org/10.1787/health_glance-2015-graph191-en
Variation in Primary Care Indicators
Potential causes of variation:
• Disease prevalence
• How physicians diagnose
• How data coders interpret diagnoses
http://en.wikipedia.org/wiki/Bulletproof_vest
Term labels: (quality), (leniency); (ability), (challenge); (resistance), (penetration)
Rasch (1963)
$\ln\left(\dfrac{P_{success,i,j}}{1 - P_{success,i,j}}\right) = \theta_i - \delta_j$
Example
• i-th person of ability $\theta_i$ faced with task of level of difficulty $\delta_j$
• probability, $P_{success}$, of achieving task
Category attribute | Object attribute, $\delta_j$ | Person characteristic, $\theta_i$
Satisfaction | Quality of product | User leniency
Difficulty | Level of difficulty of activity | (Dis-)ability
Accessibility | Accessibility of transport mode | Utility (or net benefit, …)
Rasch psychometric model
Logistic regression
Generalised linear model, link function, $z$:
$z = \ln\left(\dfrac{P_{success,i,j}}{1 - P_{success,i,j}}\right) = \theta_i\ \text{(ability)} - \delta_j\ \text{(difficulty)}$
Rasch (1961)
Level of ability
Level of difficulty
$P_{success} = \dfrac{e^{\theta - \delta}}{1 + e^{\theta - \delta}}$
θ: Ability
δ: Difficulty
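A minimal sketch of the Rasch expression above, which also checks that the difference in log-odds between two persons does not depend on which task is used. The ability and difficulty values and the function names are hypothetical.

```python
import math


def p_success(theta, delta):
    """Probability of success for ability theta facing difficulty delta."""
    return math.exp(theta - delta) / (1.0 + math.exp(theta - delta))


def log_odds(p):
    """Natural-log odds of a probability p."""
    return math.log(p / (1.0 - p))


theta_a, theta_b = 1.5, -0.5  # two hypothetical persons
gaps = []
for delta in (-1.0, 0.0, 2.0):  # three hypothetical tasks
    gap = log_odds(p_success(theta_a, delta)) - log_odds(p_success(theta_b, delta))
    gaps.append(gap)
    print(f"delta = {delta:+.1f}  log-odds gap = {gap:.3f}")
```

The gap equals the difference between the two abilities for every task, which is what allows person and task parameters to be estimated separately.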
Construct map: Vision functionality
Pesudovs 2010
Man as Measurement Instrument
Person
Tool
Task
Task - tool
Environment*
• Body structures*
• Body functions*
Person – tool – task
• Activity*
• Participation*
*Five components of health:
[International Classification of Functioning, Disability and Health (ICF)]
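[Figure: item labels ”7. Sports motiv” and ”1. Healthy”; scale from ”Less difficult” to ”More difficult”; annotation ”underestimated”]
$\text{Average } CTT_i = \dfrac{\sum_{j=1}^{L} P_{success,i,j}}{L}$
$\ln\left(\dfrac{P_{success,i,j}}{1 - P_{success,i,j}}\right) = \theta_i - \delta_j$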
δ (Difficulty)
θ (Ability)
Rasch (1961)
Measuring People
• Correct ordinal data treatment
• Better resolution
Balance as Measurement Instrument - Sensitivity ($C$)
$R = C \cdot S$ + ”additional terms”
Stimulus ($S$): Mass of weight
Response ($R$): Mass of weight × Balance sensitivity
Measurand ’restitution’, $S = C_{cal}^{-1} \cdot R$
Calibration
$R_{cal} = C_{cal} \cdot S_{cal}$ + ”additional terms”
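The calibration and restitution steps for the balance can be sketched as follows. This is a simplified illustration that neglects the ”additional terms”; the function names and the numerical readings are hypothetical.

```python
def calibrate(response_cal, stimulus_cal):
    """Sensitivity C_cal = R_cal / S_cal from a known reference stimulus."""
    if stimulus_cal == 0:
        raise ValueError("reference stimulus must be non-zero")
    return response_cal / stimulus_cal


def restitute(response, sensitivity):
    """Measurand restitution S = C_cal^-1 * R."""
    return response / sensitivity


# Hypothetical numbers: a 100 g reference weight gives a reading of 102.
c_cal = calibrate(response_cal=102.0, stimulus_cal=100.0)
unknown_mass = restitute(response=51.0, sensitivity=c_cal)
print(round(unknown_mass, 6))
```

The sensitivity obtained with the reference weight is reused to convert the reading for the unknown weight back into a mass.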
Man as Measurement Instrument - Sensitivity ($C$)
$R = C \cdot S$ + ”additional terms”
Stimulus ($S$): Task difficulty
θ (Ability)
Measurand ’restitution’, $S = C^{-1} \cdot R$
Measurement systems
Response ($R$): Task difficulty × ’Instrument’ sensitivity
$P_{success} = \dfrac{e^{\theta - \delta}}{1 + e^{\theta - \delta}}$
δ (Difficulty)
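By analogy with the balance, restitution here means recovering the task difficulty δ from the observed responses of persons acting as instruments. The sketch below uses a Newton-Raphson maximum-likelihood estimate under the Rasch expression and assumes the person abilities are already known from a calibration. The estimation approach, the function names and the numbers are assumptions for illustration and are not taken from the slides.

```python
import math


def p_success(theta, delta):
    """Rasch success probability for ability theta and difficulty delta."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def estimate_difficulty(responses, abilities, max_iter=50, tol=1e-10):
    """Maximum-likelihood difficulty of one task from 0/1 responses."""
    raw = sum(responses)
    if raw == 0 or raw == len(responses):
        raise ValueError("all-fail or all-pass data give no finite estimate")
    delta = 0.0
    for _ in range(max_iter):
        probs = [p_success(t, delta) for t in abilities]
        information = sum(p * (1.0 - p) for p in probs)
        # Newton-Raphson step on the log-likelihood with respect to delta.
        step = (sum(probs) - raw) / information
        delta += step
        if abs(step) < tol:
            break
    probs = [p_success(t, delta) for t in abilities]
    standard_error = 1.0 / math.sqrt(sum(p * (1.0 - p) for p in probs))
    return delta, standard_error


abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical calibrated persons
responses = [0, 0, 1, 1, 1]              # 1 = task achieved
delta_hat, se = estimate_difficulty(responses, abilities)
print(f"delta = {delta_hat:.2f} ± {se:.2f}")
```

At the estimate, the expected number of successes equals the observed number, and the standard error gives the estimate its own uncertainty in logits.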
Metrological references
Difficulty
Mass
Tasks
Physical disability
δ: Difficulty
Adapted from: Cano, Hobart IOMW 2014
Functional Independence Measure (FIM)
Barthel Index (BI)
θ (Ability of patient)
[Figure: ability scale θ from ”Less able” to ”More able”; $N_{TP} = 52$; $k = 2$; marked: TP32 + TP42, TP47]
$P_{success}$ = 50%
$P_{success}$ = 18%
$P_{success}$ = 98%
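As a worked check, the three success probabilities shown correspond to the following distances between ability and difficulty on the logit scale, using the Rasch relation ln(P/(1 − P)) = θ − δ (values rounded to two decimals):

```latex
\begin{align*}
P_{success} = 50\% &: \quad \theta - \delta = \ln\frac{0.50}{0.50} = 0 \\
P_{success} = 18\% &: \quad \theta - \delta = \ln\frac{0.18}{0.82} \approx -1.52 \\
P_{success} = 98\% &: \quad \theta - \delta = \ln\frac{0.98}{0.02} \approx 3.89
\end{align*}
```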
$P_{success}$ = 18%
$P_{success}$ = 98%
θ (Ability of patient)
NeuroMet
EMPIR 15HLT04: Innovative measurements for improved diagnosis and management of neurodegenerative diseases
June 2016 – June 2019
Acknowledgments
The European Metrology Programme for Innovation & Research (EMPIR, Horizon2020, Art. 185) is jointly funded by the EMPIR participating countries within EURAMET (www.euramet.org) and the European Union in this EMPIR 15 HLT04 NeuroMet project (coordinator: LGC (UK))
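[Figure: ability scale from ”Less able” to ”More able”; regions ”Uncertain”, ”Possible”, ”Probable”]
$\ln\left(\dfrac{P_{success,i,j}}{1 - P_{success,i,j}}\right) = \theta_i - \delta_j$
$\text{Sum } CTT_i = \sum_{j=1}^{L} P_{success,i,j}$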
[Figure: scale from ”Less difficult” to ”More difficult”; items: Naming objects, Delayed recall]
Metrological references
Task difficulty, δ
Difficulty
Mass
Naming objects
Orientation
Delayed recall
PLOS ONE | DOI:10.1371/journal.pone.0162889 October 14, 2016
Under-estimate
Mini Mental State Examination
More able
Less able
Healthy
Mild cognitive impairment
AD
PROPERTY | CTT | IMPLICATION | RMT | IMPLICATION
Group-level summary statistics legitimate | YES | Mean (SD) achievable for samples | YES | Mean (SD) achievable for samples
Item/person parameters can be estimated separately | NO | Equivalence of measurement across not guaranteed | YES | All patients scored within an equivalent frame of reference
Total score is a sufficient statistic | NO | Different patterns of item responses can lead to same score | YES | Higher/lower scores reflect more/less of the construct
Clinical hierarchy maintained (calibrated items) | NO | Meaning can change each time the scale is used | YES | Meaning of measurements same every time scale used
Missing data is handled appropriately | NO | Reduce the power of your analysis | YES | Maximise all of the patient scores in a dataset
Invariance across the scale | NO | Score change across whole range scale does not mean same thing | YES | Score change across whole range scale means same thing
Automatic measurement in real time | NO | Impedes usability of the instrument and increases potential for error | YES | Improves usability of the instrument / reduces error
Individual patient scores with bespoke standard errors | NO | Unable to legitimately use individual patient scores | YES | Can use individual patient scores
Tests for person fit | NO | Unable to legitimately examine patient fit | YES | Patterns of persons responses falling outside accepted range can be assessed