
04 - EHR data methodologies in clinical research: perspectives from the field. Session 1: Semantic harmonization: definition; content; ontologies. Common data models for sharing EHR data across settings


(1)

EHR Data Methodologies in Clinical Research: Perspectives from the Field

Michael G. Kahn MD, PhD

Professor, Pediatric Epidemiology

University of Colorado

Michael.Kahn@ucdenver.edu

11 December 2014

Funding by: PCORI Contract ME-1303-5581 (Kahn), NCATS UL1TR001082 (Sokol), AHRQ R01HS022956 (Schilling)

Session 1: Semantic Harmonization; Definition; Content; Ontologies

Common Data Models for Sharing EHR Data across Settings

(2)

Disclosures

Presentation based on EDM Forum commissioned paper:


(3)

A common data model is critical!

[Diagram. Data sources: EHR-1, EHR-2, EHR-3, Local Data Warehouse, Clinical Registries, Other. Each of three sites exposes: Limited Data Set, Common Data Model, Common Terminology. A single Common Query spans the three sites.]
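To make the "common query" idea concrete, here is a minimal sketch in which three sites hold their own data in an identical table layout and one query string runs unchanged at each site. The table `diagnosis`, its columns, the site names, and the sample rows are all assumptions made for illustration; they are not taken from any of the data models discussed in this talk.

```python
import sqlite3

# Hypothetical shared table layout standing in for a common data model.
COMMON_SCHEMA = "CREATE TABLE diagnosis (patient_id INTEGER, dx_code TEXT)"

# One query text, written once, run unchanged at every site.
COMMON_QUERY = (
    "SELECT COUNT(DISTINCT patient_id) FROM diagnosis "
    "WHERE dx_code LIKE '493%'"
)


def make_site(rows):
    """Create an in-memory database that follows the shared layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(COMMON_SCHEMA)
    conn.executemany("INSERT INTO diagnosis VALUES (?, ?)", rows)
    return conn


# Made-up sample data; each site keeps its own rows locally.
sites = {
    "site_a": make_site([(1, "493.90"), (2, "250.00")]),
    "site_b": make_site([(1, "493.20"), (1, "493.90"), (3, "493.00")]),
    "site_c": make_site([(5, "401.9")]),
}

# Only aggregate counts leave each site; they are summed centrally.
counts = {name: conn.execute(COMMON_QUERY).fetchone()[0]
          for name, conn in sites.items()}
total = sum(counts.values())
print(counts, total)
```

Because every site shares the layout and the code set, the query needs no per-site rewriting, which is the point the diagram makes.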

(4)

What is a data model & why should I care?

• A data model determines:

– What data elements can be stored

– What relationships between data can be represented

– Technical details: data type, allowed ranges, required versus optional (missingness)

• You should care because it:

– Determines how easily data can be recorded, extracted & queried

– Contributes to data quality
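As an illustration of how a data model's technical details shape data quality, the sketch below declares a small table with required columns, an optional column, and an allowed range, then shows which rows the database accepts. The table `lab_result` and its columns are hypothetical. SQLite is used only because it ships with Python; it enforces declared types loosely, so the sketch relies on `NOT NULL` and `CHECK` constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE lab_result (
        patient_id   INTEGER NOT NULL,             -- required
        test_name    TEXT    NOT NULL,             -- required
        result_value REAL
            CHECK (result_value IS NULL OR result_value >= 0),  -- optional, ranged
        result_unit  TEXT                          -- optional
    )
""")


def try_insert(row):
    """Insert one row and report whether the model accepted it."""
    try:
        conn.execute("INSERT INTO lab_result VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as err:
        return f"rejected: {err}"


outcomes = [
    try_insert((1, "HbA1c", 6.8, "%")),      # complete, in range
    try_insert((1, "HbA1c", None, None)),    # optional fields missing
    try_insert((None, "HbA1c", 7.1, "%")),   # required field missing
    try_insert((2, "HbA1c", -3.0, "%")),     # value outside allowed range
]
for outcome in outcomes:
    print(outcome)
```

The first two rows are stored and the last two are refused, so the model itself, rather than downstream cleaning, decides which kinds of missingness and which values can ever reach an analyst.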

(5)

Visit-centric versus Patient-centric

(6)

Query: “For various age groups, how many medications were filled?”

Four-table join

(7)

Query: “For various age groups, what is the average number of prescriptions per visit?”

Two-table join

Three-table join + date comparisons
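The schemas behind these join counts are not reproduced in this transcript, so the following is only a sketch under assumed layouts. It supposes that in a visit-centric layout each prescription row points at a visit, while in a patient-centric layout visits and prescriptions each point at a patient and must be matched to each other by date. All table and column names (`vc_visit`, `pc_rx`, and so on), the two age groups, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed visit-centric layout: a prescription belongs to a visit.
    CREATE TABLE vc_visit (visit_id INTEGER PRIMARY KEY, age_at_visit INTEGER);
    CREATE TABLE vc_rx    (rx_id INTEGER PRIMARY KEY, visit_id INTEGER);
    INSERT INTO vc_visit VALUES (1, 10), (2, 40), (3, 45);
    INSERT INTO vc_rx    VALUES (1, 1), (2, 1), (3, 2);

    -- Assumed patient-centric layout: visits and prescriptions belong to a patient.
    CREATE TABLE pc_patient (patient_id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE pc_visit   (visit_id INTEGER PRIMARY KEY, patient_id INTEGER,
                             visit_date TEXT);
    CREATE TABLE pc_rx      (rx_id INTEGER PRIMARY KEY, patient_id INTEGER,
                             rx_date TEXT);
    INSERT INTO pc_patient VALUES (1, 10), (2, 40);
    INSERT INTO pc_visit   VALUES (1, 1, '2014-01-05'), (2, 2, '2014-02-01'),
                                  (3, 2, '2014-03-01');
    INSERT INTO pc_rx      VALUES (1, 1, '2014-01-05'), (2, 1, '2014-01-05'),
                                  (3, 2, '2014-02-01');
""")

# Two-table join: the visit key on each prescription does the linking.
VISIT_CENTRIC = """
    SELECT CASE WHEN v.age_at_visit < 18 THEN 'child' ELSE 'adult' END AS age_group,
           COUNT(r.rx_id) * 1.0 / COUNT(DISTINCT v.visit_id) AS avg_rx
    FROM vc_visit v
    LEFT JOIN vc_rx r ON r.visit_id = v.visit_id
    GROUP BY age_group
"""

# Three-table join plus a date comparison to attach prescriptions to visits.
PATIENT_CENTRIC = """
    SELECT CASE WHEN p.age < 18 THEN 'child' ELSE 'adult' END AS age_group,
           COUNT(r.rx_id) * 1.0 / COUNT(DISTINCT v.visit_id) AS avg_rx
    FROM pc_patient p
    JOIN pc_visit v ON v.patient_id = p.patient_id
    LEFT JOIN pc_rx r ON r.patient_id = v.patient_id
                     AND r.rx_date = v.visit_date
    GROUP BY age_group
"""

visit_centric_result = dict(conn.execute(VISIT_CENTRIC).fetchall())
patient_centric_result = dict(conn.execute(PATIENT_CENTRIC).fetchall())
print(visit_centric_result)
print(patient_centric_result)
```

Both queries answer the same question on equivalent data, but the second has to infer the prescription-to-visit link from dates, which is more work to write and easier to get wrong.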

(8)

SAFTINet Asthma Cohort Definition

• Adults (ages 18 and over) as of Jan 1, 2009 receiving care in selected sites who:

– Have had at least 2 visits separated by at least 30 days coded as 493.xx in the 18 months prior to July 1, 2011, OR

– A single diagnosis of 493.xx AND two filled prescriptions for an asthma maintenance medication separated by at least 30 days in the past 12 months.

• Exclusion criteria: Patients with other concomitant chronic lung disease

– Cystic fibrosis

– COPD, emphysema, chronic bronchitis

– Alpha-1-antitrypsin deficiency

– Pulmonary fibrosis

– Active TB
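The cohort rules above can be sketched as a small predicate. This is an illustrative reading, not the SAFTINet implementation. It assumes that "the past 12 months" is anchored to the same July 1, 2011 date as the 18-month window, that the "single diagnosis" may fall at any time, and that the exclusion diagnoses have already been resolved into one flag. The `Patient` class and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date

AGE_REFERENCE = date(2009, 1, 1)
INDEX_DATE = date(2011, 7, 1)
WINDOW_18M_START = date(2010, 1, 1)  # 18 months before July 1, 2011
WINDOW_12M_START = date(2010, 7, 1)  # assumption: same anchor date


@dataclass
class Patient:
    birth_date: date
    asthma_visit_dates: list = field(default_factory=list)      # visits coded 493.xx
    maintenance_fill_dates: list = field(default_factory=list)  # filled prescriptions
    has_excluded_lung_disease: bool = False                     # CF, COPD, etc.


def age_on(birth, reference):
    """Age in completed years on the reference date."""
    before_birthday = (reference.month, reference.day) < (birth.month, birth.day)
    return reference.year - birth.year - before_birthday


def has_two_events_30_days_apart(dates, start, end):
    """True if two dates in [start, end) are at least 30 days apart."""
    in_window = sorted(d for d in dates if start <= d < end)
    return len(in_window) >= 2 and (in_window[-1] - in_window[0]).days >= 30


def in_asthma_cohort(p):
    if age_on(p.birth_date, AGE_REFERENCE) < 18 or p.has_excluded_lung_disease:
        return False
    two_visits = has_two_events_30_days_apart(
        p.asthma_visit_dates, WINDOW_18M_START, INDEX_DATE)
    diagnosis_plus_fills = bool(p.asthma_visit_dates) and has_two_events_30_days_apart(
        p.maintenance_fill_dates, WINDOW_12M_START, INDEX_DATE)
    return two_visits or diagnosis_plus_fills


example = Patient(date(1980, 6, 15), [date(2010, 3, 1), date(2010, 5, 1)])
print(in_asthma_cohort(example))
```

Every clause needs diagnosis codes, visit dates, fill dates, and birth dates to be findable and linkable, which is why the choice of data model matters for even a short definition like this one.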

(9)

Key questions for a data model

• From Jeff Brown regarding FDA Sentinel Initiative*:

1. What does the system need to do?

2. What data are needed to meet system needs?

3. Where will the data be stored?

4. Where will the data be analyzed?

5. Is a common data model needed, and if so, what will the model look like?

*Brown JS, Lane K, Moore K, Platt R. Defining and evaluating possible database models to implement the FDA Sentinel initiative. U.S. Food and Drug Administration; May 2009.

(10)

Eight dimensions of data models

Modified from Moody and Shanks*

Each dimension is listed with its original definition and the definition recast for CER.

1. Completeness
– Original definition: Does the data model contain all user requirements?
– Recast definition for CER: Can the data model store and retrieve data to meet investigator CER needs?

2. Integrity
– Original definition: Does the data model conform to the business rules and processes to guarantee data integrity and enforce policies?
– Recast definition for CER: Does the data model enforce meaningful data relationships and constraints that uphold the intent of the data’s original purpose, i.e., clinical care, billing?

3. Flexibility
– Original definition: Does the data model deal with business and/or regulatory change?
– Recast definition for CER: Can new data elements and relationships be added if project scope or regulatory rules (e.g., patient identification) change?

4. Understandability
– Original definition: Are the concepts and structures in the data model easily understood?
– Recast definition for CER: Do the concepts, structures and relationships make sense to investigators, data managers, and statisticians?

(11)

Eight dimensions of data models

Modified from Moody and Shanks*

5. Correctness
– Original definition: Does the data model conform to the rules of the data modeling technique?
– Recast definition for CER: Does the model conform to good data modeling practices such as limited data storage redundancy?

6. Simplicity
– Original definition: Does the data model contain the minimum possible entities and relationships?
– Recast definition for CER: Are concepts represented as straightforwardly as possible? Are all data elements necessary?

7. Integration
– Original definition: Is the data model consistent with the rest of the organization’s data?
– Recast definition for CER: Do all of the various data domains, such as demographics, observations, labs and medications “hang together” in a consistent and logical fashion?

8. Implementability
– Original definition: Can the data model be implemented within existing time, budget, and technology constraints?
– Recast definition for CER: Can the data model be implemented and maintained by current and future partners given anticipated budgets, time, and technical constraints?

(12)

Major common data models

Each model is listed with its developing entity and initial purpose.

Observational Medical Outcomes Partnership (OMOP)
– Developing entity: Foundation of the NIH, now Reagan-Udall Foundation
– Initial purpose: Comparative drug outcomes studies

i2b2
– Developing entity: Partners Healthcare
– Initial purpose: Informatics framework for clinical and biological data integration. Widely used across NCATS CTSAs

HMORN Virtual Data Warehouse (VDW)
– Developing entity: HMO Research Network
– Initial purpose: Distributed data warehouse to allow comparative studies across collaborating sites: HMORN, CRN, Oregon CTRI

Mini-Sentinel
– Developing entity: FDA
– Initial purpose: Derivative of VDW focused on large-scale drug surveillance

PCORnet
– Developing entity: PCORI
– Initial purpose: Derivative of Mini-Sentinel focused on PCOR research

(13)

A common data model is critical!

[Diagram. Data sources: EHR, Local Data Warehouse, Clinical Registries, Other. Each of three sites exposes: Limited Data Set, Common Data Model, Common Terminology. A single Common Query spans the three sites.]

Crossing the CER chasm!!

(14)

Public domain tools to help “Cross the CER Chasm”

• Data profiling with OHDSI White Rabbit

• Data transformation documentation with OHDSI Rabbit in a Hat
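For readers unfamiliar with data profiling, the sketch below shows the kind of column-level scan that typically precedes mapping a source into a common data model: row counts, missing values, and the most frequent values per column. It is a generic illustration written for this transcript and does not use or reproduce the White Rabbit tool or its output format; the function `profile_columns` and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter


def profile_columns(rows):
    """Summarize each column: rows seen, missing values, value frequencies."""
    profile = {}
    for row in rows:
        for column, value in row.items():
            stats = profile.setdefault(
                column, {"rows": 0, "missing": 0, "values": Counter()})
            stats["rows"] += 1
            if value is None or value.strip() == "":
                stats["missing"] += 1
            else:
                stats["values"][value.strip()] += 1
    return profile


# Made-up extract standing in for a source table.
sample = io.StringIO(
    "patient_id,sex,dx_code\n"
    "1,F,493.90\n"
    "2,,493.20\n"
    "3,M,\n"
    "4,F,493.90\n"
)

report = profile_columns(csv.DictReader(sample))
for column, stats in report.items():
    print(column, stats["rows"], stats["missing"], stats["values"].most_common(2))
```

A scan like this shows which source columns are sparsely populated and which local codes appear, both of which inform how the transformation into the target model is documented.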


(16)

Ensuring Data Consistency/Comparability

