03 - Common data model clinical data tables: laboratory test results as an example

(1)

Common Data Model Clinical Data Tables: Laboratory Test Results as an Example

Marsha A. Raebel, PharmD

Senior Investigator, Kaiser Permanente Colorado

Marsha.A.Raebel@kp.org

(2)

Development Principles of the Mini-Sentinel and HMO Research Network Laboratory Results Tables (LRT)

• Transparency
• Maximize use of existing data resources
• Stay as close to source data as possible
• Recognize disparities in electronic clinical data sources
• Leverage lab results reporting standards (e.g., LOINC) when feasible
• Seek guidance from those with expertise
– Investigators with clinical database and lab test result interpretation knowledge
– Project managers with experience managing multiple sites
– Data partner representatives with knowledge of site-specific source data
– Programmers/analysts with clinical results table and lab test expertise

(3)

Laboratory Test Results in the Mini-Sentinel LRT (9/13)

(4)

Laboratory Test Results in the Mini-Sentinel LRT (11/14)

• 12 Data Partners participating

[Table columns: Date Range, Unique Lab Test Results, Unique Patient IDs]

(5)

Development and Implementation of the Mini-Sentinel LRT

Detailed information:

Raebel MA, Haynes K, Woodworth TS, et al. Electronic Clinical Laboratory Test Results Data Tables: Lessons from Mini-Sentinel. Pharmacoepidemiol Drug Saf. 2014;23(6):609-18.

Current Mini-Sentinel LRT data dictionary:

(6)

Laboratory Procedures (administrative data) vs. Results (clinical data) Tables

Laboratory Test Procedure Tables
• Administrative data (e.g., CPT code; test done)
• Developed for billing
• Use standardized nomenclature and coding
• Not useful in defining cohorts, assessing outcomes, or adjusting for confounders

Laboratory Test Results Tables
• Clinical data (e.g., test result values)
• Developed for patient care
• Lack standardized nomenclature and coding

• Useful for cohort identification, outcomes assessment, and confounder adjustment

(7)

Information in Source Data used to Extract Lab Results across 12 Mini-Sentinel Data Partners

Extraction Source                          Number of Data Partners Using Source
Test name/test substring search            8
LOINC                                      7
Component codes                            6
Test-specific CPT codes                    5
Site-specific codes                        4
Test name & specimen type combination      2
Other                                      2
Battery/panel codes                        0
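The most common extraction source above, a test name substring search, can be sketched as follows. This is a minimal illustration only: the function name and the sample test names are hypothetical and are not taken from any data partner's source system.

```python
def find_candidate_tests(test_names, substrings):
    """Return source test names containing any search substring (case-insensitive)."""
    wanted = [s.lower() for s in substrings]
    return [name for name in test_names if any(s in name.lower() for s in wanted)]


# Hypothetical source test names, for illustration only.
source_names = [
    "CREATININE, SERUM",
    "Creatinine Clearance",
    "PLATELET COUNT",
    "hCG, Qual, Urine",
]

# A search for serum creatinine also returns "Creatinine Clearance",
# so the candidate list still needs review by someone who knows the tests.
candidates = find_candidate_tests(source_names, ["creatinine"])
```

The over-matching in this example is one reason a substring search usually needs a manual review step before results are loaded.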

(8)

Challenges in Developing Laboratory Results Data into a Common Data Model

• Lab test results obtained during routine healthcare delivery
– No uniform coding or standard documentation. Use of standards (e.g., LOINC) is variable and inconsistent
– Vary across organizations and within an organization over time
• Tests change over time
– Identifiers
– Result units
– Repeated re-evaluation necessary to ensure current and comprehensive incorporation of source and transformed data
• Result units
– Multiple (e.g., mmol/L, IU/L, mg/dl) for a single test require conversion to a standard unit
– Incomplete (e.g., number with no unit volume [no denominator])
• Reference ranges
– Unique for every test type
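Conversion to a standard unit can be sketched as below, using serum creatinine as the example. This is an assumption-laden illustration rather than the Mini-Sentinel implementation: the choice of mg/dL as the standard unit, the function name, and the table of accepted units are hypothetical. The factor of about 88.4 µmol/L per 1 mg/dL is the usual creatinine conversion.

```python
# Factors that convert a serum creatinine value to mg/dL.
# Only the units listed here are handled (illustrative, not exhaustive).
TO_MG_PER_DL = {
    "mg/dl": 1.0,
    "umol/l": 1 / 88.4,  # 1 mg/dL creatinine is about 88.4 umol/L
}


def to_standard_unit(value, unit):
    """Return the value in mg/dL, or None when the unit is missing or unknown."""
    if unit is None:
        return None
    factor = TO_MG_PER_DL.get(unit.strip().lower())
    if factor is None:
        return None  # incomplete or unrecognized unit: flag for review
    return value * factor


converted = to_standard_unit(88.4, "umol/L")  # approximately 1.0 mg/dL
unknown = to_standard_unit(5, "")             # None: no unit recorded
```

Returning None for a missing or unrecognized unit, instead of guessing, keeps the transformed value traceable to the source data.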

(9)

(10)

Characterization, Harmonization, and Quality Checking the Mini-Sentinel LRT

• Transformed results data evaluated initially (e.g., upon loading) and with each refresh
• Assessments for each variable separately for each lab test type include completeness, consistency, content, alignment with specifications, patterns, and trends
• Data distributions examined over time within and between Mini-Sentinel Distributed Database refreshes
• Feedback given to data partners with expectation that anomalies be investigated, corrected, or otherwise addressed
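A completeness assessment for each variable, separately for each lab test type, might look like the sketch below. The record layout and field names (`test_type`, `result`, `unit`) are hypothetical and do not reproduce the LRT data dictionary.

```python
from collections import defaultdict


def completeness_by_test(records, fields):
    """Fraction of non-missing values for each field, per lab test type."""
    totals = defaultdict(int)
    filled = defaultdict(lambda: defaultdict(int))
    for rec in records:
        test = rec["test_type"]
        totals[test] += 1
        for field in fields:
            # Treat None and the empty string as missing.
            if rec.get(field) not in (None, ""):
                filled[test][field] += 1
    return {
        test: {field: filled[test][field] / n for field in fields}
        for test, n in totals.items()
    }


# Hypothetical records, for illustration only.
records = [
    {"test_type": "PLATELETS", "result": "250", "unit": "K/uL"},
    {"test_type": "PLATELETS", "result": "180", "unit": ""},
    {"test_type": "CREATININE", "result": "1.1", "unit": "mg/dL"},
]
report = completeness_by_test(records, ["result", "unit"])
```

Running the same report on each refresh and comparing the fractions over time is one simple way to surface the trends and anomalies mentioned above.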

(11)

Examples of Variations in Platelet (Quantitative) Result Units in Source Data

(12)

Examples of Variations in (Qualitative) Pregnancy Result Units in Source Data (aka, how many ways can you spell negative?)

NEGATIVE POSITIVE UNDETERMINED BORDERLINE BORDERLI NEG NONE DET POS COMMENT: 160.8 0.5 1.2 1000 122 14 140 15 2 2 2.1 203 252.3 278 28 3178.2 345 38.1 400 5 Int 5272.4 642.2 670 697.7 DETECTED INDETERM N NOT DETE Neg Negative Negatvie P Positive SPRCS TNP n . 820 840 1615 ABNORMAL BOARDERL BODERLIN CANCELLE DUPLICAT EQIVOCAL EQUIVOCA HIRABAYA NE-CHECK NEAGTIVE NEG (-) NEGA NEGA T I NEGA TIV NEGAT IV NEGATAIV NEGATIAV NEGATIBE NEGATIE NEGATRIV NEGATTVE NEGATVIE NEGAVTIV NEGITIVE NEGTIVE NETGATIV NORM NORMAL POA POPSITIV POSIITIV POSITIFV POSITTVE POSITVE POSOTIVE POSTIVE PSOITIVE REPEAT STAT URINE
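One way to harmonize spelling variants like those above is an explicit abbreviation map followed by fuzzy matching, with everything else left for manual review. The sketch below uses Python's standard `difflib`. The canonical value list, the abbreviation map (including the reading of `N` and `P`), and the 0.8 cutoff are assumptions for illustration and would need confirmation against each site's source data.

```python
import difflib
import re

CANONICAL = ["NEGATIVE", "POSITIVE"]

# Assumed meanings of short codes; confirm with each data partner.
ABBREVIATIONS = {"NEG": "NEGATIVE", "POS": "POSITIVE", "N": "NEGATIVE", "P": "POSITIVE"}


def normalize_qualitative(raw, cutoff=0.8):
    """Map a raw result string to a canonical value, or None if unresolved."""
    letters = re.sub(r"[^A-Z]", "", raw.upper())  # keep letters only
    if not letters:
        return None  # numeric or empty values are not qualitative results
    if letters in ABBREVIATIONS:
        return ABBREVIATIONS[letters]
    close = difflib.get_close_matches(letters, CANONICAL, n=1, cutoff=cutoff)
    return close[0] if close else None


examples = ["Negatvie", "POSTIVE", "NEG (-)", "160.8", "STAT"]
normalized = [normalize_qualitative(value) for value in examples]
```

Values that return None (numbers, `STAT`, truncated strings such as `NOT DETE`) are deliberately left unresolved so that a person decides how to handle them.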

(13)

Missingness/Completeness: Serum Creatinine (sCr) Procedure Codes vs. Lab Result Values

• Modular program query of Mini-Sentinel LRT
• sCr laboratory test results and procedure (CPT) codes
• Entire Mini-Sentinel Distributed Database population
– Lab results for 90%-100% of enrollees in integrated healthcare delivery systems
– Lab results for ~30% of enrollees in large national insurers
• Inform further assessment of missing LRT values
• Crudely estimate numbers and proportions of sCr test results with and without corresponding coded procedures

(14)

Serum Creatinine Procedure Codes vs. Lab Result Values

• ≥55% of CPT codes from any care setting did not have lab result values
• ~10% of lab result values from outpatient settings did not have CPT codes
• ~75% of lab result values from inpatient settings did not have CPT codes
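Proportions like these can be estimated crudely by comparing procedure records with result records. The slides do not state how a "corresponding" procedure was defined, so the sketch below assumes a match on the same patient and the same date; that matching rule, the function name, and the sample records are all hypothetical.

```python
def unmatched_proportions(procedures, results):
    """Compare (patient_id, date) pairs from procedure and result tables.

    Returns (share of procedures with no result, share of results with no procedure).
    """
    proc_keys = set(procedures)
    result_keys = set(results)
    if not proc_keys or not result_keys:
        raise ValueError("both inputs must contain at least one record")
    procs_without_result = len(proc_keys - result_keys) / len(proc_keys)
    results_without_proc = len(result_keys - proc_keys) / len(result_keys)
    return procs_without_result, results_without_proc


# Hypothetical records, for illustration only.
procedures = [("A", "2014-01-05"), ("B", "2014-01-07"), ("C", "2014-02-01"), ("D", "2014-02-02")]
results = [("A", "2014-01-05"), ("E", "2014-03-01")]

no_result, no_procedure = unmatched_proportions(procedures, results)
```

A real comparison would likely need a date window instead of an exact date match, and stratification by care setting, as in the figures above.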

(15)

Key Points about Developing and Implementing a Multi-Site LRT into a CDM

Unique Challenges
• Multiple source databases
• No uniform coding
• Few documentation standards
• Inter- and intra-organization variation
• Every lab test has its own considerations

Systematic Approach
• Engage experts
• Stay as close as possible to source data
• Decisions can be necessary on test-by-test basis
• Characterize data
• Harmonize data
• Quality check
• Provide feedback to sites

Ongoing Oversight
• Continuous monitoring and management to keep valid and useful
• Identifies emerging themes and issues
• Facilitates updates
• Apply systematic approach