
How to calibrate a questionnaire: quality-assuring categorical data with psychometric measurement theory


© L R Pendrill 180207 BEMC Berlin


L R Pendrill

Research Institutes of Sweden, Metrology,

Eklandagatan 86, 41261 Göteborg (SE),

phone:+46 767 88 54 44, mailto:leslie.pendrill@ri.se


Man as Measurement Instrument: Counting

’M dots’

http://itre.cis.upenn.edu/~myl/CommuniqueMundurucuENG.pdf


Measuring Man:

- Status, function of person

- Test against specifications

Man as Measurement Instrument:

- Perception of product/service function, comfort etc

- Propose improvements in product

Man as Measurement Instrument


Quality-assured categorical measurement


Counted fractions

”Beware of attempts to interpret correlations between ratios whose numerators and denominators contain common parts” [Pearson 1897]

$$X_j\,(\%) = \frac{X_j}{\sum_{k=1}^{K} X_k} \times 100\,\%$$

Relative-number problems:

• Counting (sheep & goats)

• How many affected at this dose

• How many of the pebbles are quartz…

Tukey [Chapter 8, Data analysis and behavioural science; quoted in “The collected works of John W Tukey, Volume III, Philosophy and principles of data analysis: 1949 – 1964”, ed. L V Jones, Univ. North Carolina, Chapel Hill]


Different scales of measurement

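[Figure: illustration of different scales of measurement. Image source: http://www.123rf.com]

’Counted fractions’

$$X_j\,(\%) = \frac{X_j}{\sum_{k=1}^{K} X_k} \times 100\,\%$$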


Logistic ruling

$$z = \log\left(\frac{P_{success}}{1 - P_{success}}\right)$$

’Counted fractions’
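A minimal sketch of the logistic ruling applied to counted fractions, assuming Python. The counts are hypothetical and the names `counted_fractions` and `logit` are illustrative, not taken from the slides.

```python
import math


def counted_fractions(counts):
    """Turn raw category counts X_k into fractions X_j / sum_k X_k."""
    total = sum(counts)
    return [c / total for c in counts]


def logit(p):
    """Logistic ruling z = ln(p / (1 - p)) for a fraction strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


# Hypothetical counts in three categories (assumption, for illustration only).
counts = [2, 6, 12]
fractions = counted_fractions(counts)   # fractions sum to 1
z = [logit(p) for p in fractions]       # unbounded log-odds values
```

The fractions are confined to the interval from 0 to 1, whereas the transformed values `z` are unbounded and symmetric about a fraction of 50 %.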


Probability

$$q_c = \sum_{k=1}^{K} p_k \, P_{c,k}$$

$P_{c,k}$ = probability of classification c when true level is k

$p_k$ = a priori probability that true level is k

Counted fractions

$$\sum_{i=1}^{n} q_i = 1; \quad q_i \ge 0$$

$$E(y) = \sum_{i=1}^{n} y_i \, q_i$$

$$q_i = \frac{e^{b\,y_i}}{\sum_{c=1}^{C} e^{b\,y_c}}$$

b = Lagrange multiplier

$$y_1, y_2, \ldots, y_C; \quad y_c \in \mathbb{R}$$

J M Linacre 2002, ”Optimizing Rating Scale Category Effectiveness”, Journal of Applied Measurement, 3:1, pp. 85-106
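The classification probability q_c defined on this slide (prior probability p_k of each true level, weighted by the probability P_c,k of classifying level k as c) can be sketched as follows. This is a sketch only: the prior and the classification matrix are hypothetical numbers, and the function name is an assumption.

```python
def classification_probabilities(prior, p_class_given_level):
    """q_c = sum over k of p_k * P[c][k].

    prior[k]                  a priori probability that the true level is k
    p_class_given_level[c][k] probability of classification c when true level is k
    """
    return [
        sum(p_k * row[k] for k, p_k in enumerate(prior))
        for row in p_class_given_level
    ]


# Hypothetical values (assumption): three true levels, three classifications.
prior = [0.5, 0.3, 0.2]
p_class_given_level = [
    [0.8, 0.1, 0.0],   # classified as level 0
    [0.2, 0.8, 0.2],   # classified as level 1
    [0.0, 0.1, 0.8],   # classified as level 2
]
q = classification_probabilities(prior, p_class_given_level)
```

Because each column of the hypothetical matrix sums to 1, the resulting q_c also sum to 1, as required of counted fractions.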


Kaffe.11

W P Fisher Jr. 1999

(Difficulty)

$$P_{success} = \frac{e^{\theta-\delta}}{1+e^{\theta-\delta}}$$


Wright B.D. (1997) Fundamental measurement for outcome evaluation. Physical medicine and rehabilitation: State of the Art Reviews. 11(2): 261-288

$$P_{success} = \frac{e^{\theta-\delta}}{1+e^{\theta-\delta}}$$
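The success probability above depends only on the difference between ability θ and difficulty δ. A minimal sketch, assuming Python and an illustrative function name `p_success`:

```python
import math


def p_success(theta, delta):
    """Rasch success probability e^(theta - delta) / (1 + e^(theta - delta)).

    Written in the algebraically equivalent form 1 / (1 + e^-(theta - delta)).
    """
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


# Ability equal to difficulty gives an even chance of success.
even_chance = p_success(theta=1.0, delta=1.0)

# Only the difference theta - delta matters, not the individual values.
same_difference = (p_success(3.0, 2.0), p_success(0.5, -0.5))
```

Shifting both θ and δ by the same amount leaves the probability unchanged, which is why both values in `same_difference` agree.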


         PT1  PT2  PT3  PT4  PT5  PT6  PT7  PT8  PT9 PT10 PT11 PT12 PT13 PT14 PT15  ∑PT
RMI-1      0    1    0    0    0    0    1    0    0    0    0    0    0    0    0    2
RMI-2      0    1    0    0    1    0    1    0    0    0    0    0    0    0    0    3
RMI-3      0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    1
RMI-4      0    1    0    0    1    0    1    0    1    0    1    1    0    0    0    6
RMI-5      0    1    0    0    1    0    1    1    1    1    1    1    0    0    0    8
RMI-6      0    1    0    0    1    0    1    0    0    0    1    0    0    0    0    4
RMI-7      0    1    0    0    1    1    1    1    1    1    1    1    0    0    0    9
RMI-8      0    1    0    0    1    0    1    0    1    0    1    0    0    0    0    5
RMI-9      0    1    1    0    1    1    1    1    1    1    1    1    0    0    0   10
RMI-10     1    1    1    1    1    1    1    1    1    1    1    1    0    0    0   12
RMI-11     0    1    1    1    1    1    1    1    1    1    1    1    0    0    0   11
RMI-12     1    1    1    1    1    1    1    1    1    1    1    1    0    1    0   13
RMI-13     0    1    0    0    1    0    1    0    1    1    1    1    0    0    0    7
RMI-14     1    1    1    1    1    1    1    1    1    1    1    1    1    1    0   14
RMI-15     1    1    1    1    1    1    1    1    1    1    1    1    1    1    1   15
∑RMI       4   14    6    5   13    7   15    8   11    9   12   10    2    3    1

How does this work?
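One way to see how the table works is to sort its rows and columns by their totals. The sketch below, assuming Python, transcribes the 0/1 entries of the table (rows RMI-1 to RMI-15, columns PT1 to PT15) and checks whether every row then consists of a run of 1s followed by a run of 0s. The helper name `is_ordered_pattern` is hypothetical, and the sorting step is an illustration rather than the analysis used in the talk.

```python
# Rows RMI-1 ... RMI-15; each string holds the 15 entries for PT1 ... PT15.
ROWS = {
    "RMI-1": "010000100000000", "RMI-2": "010010100000000",
    "RMI-3": "000000100000000", "RMI-4": "010010101011000",
    "RMI-5": "010010111111000", "RMI-6": "010010100010000",
    "RMI-7": "010011111111000", "RMI-8": "010010101010000",
    "RMI-9": "011011111111000", "RMI-10": "111111111111000",
    "RMI-11": "011111111111000", "RMI-12": "111111111111010",
    "RMI-13": "010010101111000", "RMI-14": "111111111111110",
    "RMI-15": "111111111111111",
}

matrix = {name: [int(ch) for ch in bits] for name, bits in ROWS.items()}
n_cols = 15

# Row totals (the ∑PT column) and column totals (the ∑RMI row).
row_sums = {name: sum(values) for name, values in matrix.items()}
col_sums = [sum(values[j] for values in matrix.values()) for j in range(n_cols)]

# Sort columns and rows from highest to lowest total.
col_order = sorted(range(n_cols), key=lambda j: -col_sums[j])
row_order = sorted(matrix, key=lambda name: -row_sums[name])


def is_ordered_pattern(matrix, row_order, col_order):
    """True if every sorted row is a block of 1s followed by a block of 0s."""
    for name in row_order:
        ordered = [matrix[name][j] for j in col_order]
        score = sum(ordered)
        if ordered != [1] * score + [0] * (len(ordered) - score):
            return False
    return True


ordered = is_ordered_pattern(matrix, row_order, col_order)
```

With the entries as transcribed, the totals reproduce the ∑PT and ∑RMI margins of the table, and the sorted matrix is fully ordered: a row with total s has its 1s in exactly the s columns with the highest totals.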


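[Figure: scale running from ”Better quality” to ”Worse quality”; labels ”3. Length”, ”7. Achieve”; annotation ”underestimated”]

$$\text{Average } CTT_i = \frac{\sum_{j=1}^{L} P_{success,i,j}}{L}$$

$$\ln\left(\frac{P_{success,i,j}}{1-P_{success,i,j}}\right) = \theta_i - \delta_j$$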
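The two expressions on this slide can be compared numerically. The sketch below, assuming Python, computes the average CTT score of a person over L items from the Rasch success probabilities and shows that equal steps in θ give ever smaller steps in the average score towards the end of the scale. The item difficulties and abilities are hypothetical, as are the function names.

```python
import math


def p_success(theta, delta):
    """Rasch success probability for ability theta and difficulty delta."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def average_ctt(theta, deltas):
    """Average CTT_i = (sum over j of P_success,i,j) / L for one person."""
    return sum(p_success(theta, d) for d in deltas) / len(deltas)


# Hypothetical item difficulties in logits (assumption, for illustration).
deltas = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Equal steps of 2 logits in ability ...
thetas = [0.0, 2.0, 4.0, 6.0]
scores = [average_ctt(t, deltas) for t in thetas]

# ... give shrinking steps in the average score.
steps = [later - earlier for earlier, later in zip(scores, scores[1:])]
```

In this hypothetical case the same 2-logit difference in ability shows up as a progressively smaller difference in average score, so differences between persons near the ends of the scale are underestimated when raw scores are treated as linear.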


Measurement and ordinal data

Required - epoch (/Myear)

Measured - depth (/cm)

https://cdn-assets.answersingenesis.org/img/articles/ee/v2/geological-layers.jpg


My health?

Quality-assured measurement

Cognitive ability?

0,8 units ± 0,2 units

Object: Health

Person-centred care (PCC)

• Focus on health (not illness)

• People partners in care

• More symptoms

• Impact on Activities of Daily Living

• Subjective & perceptive

• …


OECD Health Statistics 2015, http://dx.doi.org/10.1787/health_glance-2015-graph191-en

Variation in Primary Care Indicators

Potential causes of variation:

• Disease prevalence

• How physicians diagnose

• How data coders interpret diagnoses


http://en.wikipedia.org/wiki/Bulletproof_vest

Rasch (1963)

$$\ln\left(\frac{P_{success,i,j}}{1-P_{success,i,j}}\right) = \theta_i - \delta_j$$

Annotations: (ability) vs (challenge); (resistance) vs (penetration); (leniency) vs (quality)


Example

• i-th person of ability θ_i faced with task of level of difficulty δ_j

• probability, P_success, of achieving task

Category attribute | Object attribute, δ_j | Person characteristic, θ_i
Satisfaction | Quality of product | User leniency
Difficulty | Level of difficulty of activity | (Dis-)ability
Accessibility | Accessibility of transport mode | Utility (or net benefit, …)

Rasch psychometric model

Logistic regression

Generalised linear model, link function, z:

$$z = \ln\left(\frac{P_{success,i,j}}{1-P_{success,i,j}}\right) = \theta_i - \delta_j$$

(ability) − (difficulty)


Rasch (1961)

Level of ability

Level of difficulty

$$P_{success} = \frac{e^{\theta-\delta}}{1+e^{\theta-\delta}}$$

θ: Ability

δ: Difficulty


Construct map: Vision functionality

Pesudovs 2010

Man as Measurement Instrument


Person

Tool

Task

Task - tool

Environment*

• Body structures*

• Body functions*

Person – tool – task

• Activity*

• Participation*

*Five components of health:

[International Classification of Functioning, Disability and Health (ICF)]


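[Figure: scale running from ”Less difficult” to ”More difficult”; labels ”1. Healthy”, ”7. Sports motiv”; annotation ”underestimated”]

$$\text{Average } CTT_i = \frac{\sum_{j=1}^{L} P_{success,i,j}}{L}$$

$$\ln\left(\frac{P_{success,i,j}}{1-P_{success,i,j}}\right) = \theta_i - \delta_j$$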


(Difficulty)

(Ability)

Rasch (1961)

Measuring People

Correct ordinal data treatment

Better resolution


Balance as Measurement Instrument - Sensitivity (C)

R = C·S + ”additional terms”

Stimulus (S): Mass of weight

Response (R): Mass of weight x Balance sensitivity

Measurand ’restitution’, $S = C_{cal}^{-1} \cdot R$

Calibration: $R_{cal} = C_{cal} \cdot S_{cal}$ + ”additional terms”
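The calibration and restitution steps for the balance can be sketched as follows, assuming Python and neglecting the ”additional terms”. The masses, readings and function names are hypothetical.

```python
def calibrate_sensitivity(stimulus_cal, response_cal):
    """Estimate C_cal from R_cal = C_cal * S_cal (additional terms neglected)."""
    return response_cal / stimulus_cal


def restitute(response, c_cal):
    """Measurand restitution S = R / C_cal for a new response R."""
    return response / c_cal


# Hypothetical calibration (assumption): a 100 g standard gives a reading of 98.
c_cal = calibrate_sensitivity(stimulus_cal=100.0, response_cal=98.0)

# An unknown weight gives a reading of 49; restitution recovers its mass in g.
unknown_mass = restitute(response=49.0, c_cal=c_cal)
```

Here the sensitivity is 0.98 reading units per gram, so a reading of 49 corresponds to a mass of 50 g.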


Man as Measurement Instrument - Sensitivity (C)

Measurement systems

R = C·S + ”additional terms”

Stimulus (S): Task difficulty

Response (R): Task difficulty x ’Instrument’ sensitivity

(Ability)

Measurand ’restitution’, $S = C^{-1} \cdot R$

$$P_{success} = \frac{e^{\theta-\delta}}{1+e^{\theta-\delta}}$$

δ: Difficulty


Metrological references

Difficulty

Mass

Tasks

Physical disability

δ: Difficulty


Adapted from: Cano, Hobart IOMW 2014

Functional Independence Measure (FIM)

Barthel Index (BI)

Ability of patient (θ)


N_TP = 52

[Figure: θ (Ability) scale running from ”Less able” to ”More able”; marked persons TP32, TP42, TP47; k = 2; P_success = 50%, 18%, 98%]


P_success = 18%

P_success = 98%
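As a worked illustration, the success probabilities quoted on these slides (50%, 18% and 98%) can be converted to differences between ability and difficulty with the logistic expression ln(P/(1−P)) = θ − δ used throughout the talk. The rounding to two decimals is an assumption of this sketch.

```latex
\begin{align*}
P_{success} = 0.50 &: \quad \theta - \delta = \ln\frac{0.50}{0.50} = 0 \\
P_{success} = 0.18 &: \quad \theta - \delta = \ln\frac{0.18}{0.82} \approx -1.52 \\
P_{success} = 0.98 &: \quad \theta - \delta = \ln\frac{0.98}{0.02} = \ln 49 \approx 3.89
\end{align*}
```

A success probability below 50% thus corresponds to an ability below the task difficulty, and one above 50% to an ability above it.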


Ability of patient (θ)


NeuroMet

EMPIR 15HLT04: Innovative measurements for improved diagnosis and management of neurodegenerative diseases

June 2016 – June 2019

Acknowledgments

The European Metrology Programme for Innovation & Research (EMPIR, Horizon2020, Art. 185) is jointly funded by the EMPIR participating countries within EURAMET (www.euramet.org) and the European Union in this EMPIR 15 HLT04 NeuroMet project (coordinator: LGC (UK))


Less able

More able

Uncertain, Possible, Probable

$$\ln\left(\frac{P_{success,i,j}}{1-P_{success,i,j}}\right) = \theta_i - \delta_j$$

$$\text{Sum } CTT_i = \sum_{j=1}^{L} P_{success,i,j}$$


Less difficult

More difficult

Naming objects

Delayed recall


Metrological references

Task difficulty, δ

Difficulty

Mass

Naming objects

Orientation

Delayed recall


PLOS ONE | DOI:10.1371/journal.pone.0162889 October 14, 2016

Under-estimate

Mini Mental State Examination

More able

Less able

Healthy

Mild cognitive impairment

AD


PROPERTY | CTT | IMPLICATION | RMT | IMPLICATION
Group-level summary statistics legitimate | YES | Mean (SD) achievable for samples | YES | Mean (SD) achievable for samples
Item/person parameters can be estimated separately | NO | Equivalence of measurement across not guaranteed | YES | All patients scored within an equivalent frame of reference
Total score is a sufficient statistic | NO | Different patterns of item responses can lead to same score | YES | Higher/lower scores reflect more/less of the construct
Clinical hierarchy maintained (calibrated items) | NO | Meaning can change each time the scale is used | YES | Meaning of measurements same every time scale used
Missing data is handled appropriately | NO | Reduce the power of your analysis | YES | Maximise all of the patient scores in a dataset
Invariance across the scale | NO | Score change across whole range scale does not mean same thing | YES | Score change across whole range scale means same thing
Automatic measurement in real time | NO | Impedes usability of the instrument and increases potential for error | YES | Improves usability of the instrument / reduces error
Individual patient scores with bespoke standard errors | NO | Unable to legitimately use individual patient scores | YES | Can use individual patient scores
Tests for person fit | NO | Unable to legitimately examine patient fit | YES | Patterns of persons responses falling outside accepted range can be assessed


Acknowledgments

This work has been performed as part of the 15HLT04 NeuroMet project which belongs to the European Metrology Programme for Innovation & Research (Horizon 2020), jointly funded by the EMPIR participating countries within EURAMET (www.euramet.org) and the European Union.

Thanks are due, particularly, to:

• Members of the EMPIR NeuroMet consortium

• Stefan Cano, Modus Outcomes, Boston (USA) & Stotford UK

• William P. Fisher, Jr., Ph.D., Research Associate, BEAR Center, Graduate School of Education, University of California, Berkeley, CA (USA) & Principal, LivingCapitalMetrics Consulting
