Title: Mining electronic health records: towards better research applications and clinical care
1. Mining electronic health records: towards better research applications and clinical care
Standardising the representation of clinical information for patient care and for research
Dipak Kalra, Professor of Health Informatics, University College London
2. EHR trends
- Patient-centered (gatekeeper?), lifelong records
- Multi-disciplinary / multi-professional
- Transmural, distributed and virtual
- Structured and coded (cf. semantic interoperability)
- More metadata and coding at a granular level!
- Intelligent (cf. decision support), clinical pathways
- Predictive (e.g. genetic data, physiological models)
- More sensitive content (privacy protection)
- Personalised
- Pervasive: bio-sensors, wearables...
Georges De Moor
3. Capturing and combining diverse sources of information
4. Towards integrated health
Biosensors
Genomic data
Environmental Data
Phenomic data
Integrated Electronic Health Records
Georges De Moor
5. The rich re-use of Electronic Health Records
6. Requirements the EHR must meet (ISO 18308)
The EHR shall preserve any explicitly defined
relationships between different parts of the
record, such as links between treatments and
subsequent complications and outcomes.
The EHR shall preserve the original data values
within an EHR entry including code systems and
measurement units used at the time the data were
originally committed to an EHR system.
The EHR shall be able to include the values of
reference ranges used to interpret particular
data values.
The EHR shall be able to represent or reference
the calculations, and/or formula(e) by which data
have been derived.
The EHR architecture shall enable the retrieval
of part or all of the information in the EHR that
was present at any particular historic date and
time.
The EHR shall enable the maintenance of an audit
trail of the creation of, amendment of, and
access to health record entries.
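Several of the requirements above (preserving original values, units and code systems, keeping reference ranges, retrieving the record as it stood at a historic date, and maintaining an audit trail) can be illustrated with an append-only versioned entry. This is only a minimal sketch under stated assumptions: the class names, field names and the sample code system are hypothetical and are not taken from ISO 18308 or any EHR product, and versions are assumed to be committed in chronological order.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class EntryVersion:
    """One committed version of an entry; immutable once created."""
    value: float
    unit: str                          # unit exactly as originally recorded
    code: str                          # code exactly as originally recorded
    code_system: str                   # code system in use at commit time
    reference_range: Optional[tuple]   # (low, high) used to interpret the value
    committed_at: datetime
    committed_by: str


@dataclass
class VersionedEntry:
    """Hypothetical append-only entry with an audit trail."""
    versions: list = field(default_factory=list)
    audit_trail: list = field(default_factory=list)

    def commit(self, version: EntryVersion) -> None:
        """Append a new version; earlier versions are never overwritten."""
        action = "amend" if self.versions else "create"
        self.versions.append(version)
        self.audit_trail.append((action, version.committed_by, version.committed_at))

    def as_at(self, when: datetime, accessed_by: str) -> Optional[EntryVersion]:
        """Return the version that was current at `when`, and log the access."""
        self.audit_trail.append(("access", accessed_by, datetime.now()))
        current = None
        for version in self.versions:  # assumes chronological commit order
            if version.committed_at <= when:
                current = version
        return current


entry = VersionedEntry()
entry.commit(EntryVersion(7.2, "mmol/L", "glucose", "local-lab-codes",
                          (3.9, 5.5), datetime(2012, 1, 10, 9, 0), "dr_a"))
entry.commit(EntryVersion(7.0, "mmol/L", "glucose", "local-lab-codes",
                          (3.9, 5.5), datetime(2012, 1, 11, 9, 0), "dr_b"))
```

Because amendments are appended rather than applied in place, the value seen by a clinician on 10 January remains retrievable after the correction on 11 January.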
7. Interoperability standards relevant to the EHR
8. ISO EN 13606-1 Reference Model
9. Contextual building blocks of the EHR
- EHR Extract: part or all of the electronic health record for one person, being communicated
- Folders: high-level organisation of the EHR, e.g. per episode, per clinical speciality
- Compositions: set of entries comprising a clinical care session or document, e.g. test result, letter
- Sections: headings reflecting the flow of information gathering, or organising data for readability
- Entries: clinical statements about Observations, Evaluations, and Instructions
- Clusters: multipart entries, tables, time series, e.g. test batteries, blood pressure, blood count
- Elements: element entries (leaf nodes with values), e.g. reason for encounter, body weight
- Data values: data types for instance values, e.g. coded terms, measurements with units
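The containment hierarchy described on this slide can be sketched as nested data classes. This is a simplified illustration only: the class and attribute names are assumptions chosen to mirror the slide, not the normative ISO EN 13606-1 class definitions, and details such as how Folders reference Compositions are glossed over.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    name: str
    value: object  # a data value, e.g. a coded term or (magnitude, unit)


@dataclass
class Cluster:
    name: str
    parts: List[Union["Cluster", Element]] = field(default_factory=list)


@dataclass
class Entry:
    name: str
    items: List[Union[Cluster, Element]] = field(default_factory=list)


@dataclass
class Section:
    heading: str
    members: List[Union["Section", Entry]] = field(default_factory=list)


@dataclass
class Composition:
    name: str
    content: List[Union[Section, Entry]] = field(default_factory=list)


@dataclass
class Folder:
    name: str
    compositions: List[Composition] = field(default_factory=list)


@dataclass
class EHRExtract:
    subject_id: str
    folders: List[Folder] = field(default_factory=list)


def leaf_elements(node):
    """Yield every Element below a node, whatever level the node is at."""
    if isinstance(node, Element):
        yield node
        return
    for attr in ("folders", "compositions", "content", "members", "items", "parts"):
        for child in getattr(node, attr, []):
            yield from leaf_elements(child)


blood_pressure = Cluster("blood pressure", [
    Element("systolic", (120, "mm[Hg]")),
    Element("diastolic", (80, "mm[Hg]")),
])
vital_signs = Entry("vital signs", [blood_pressure, Element("body weight", (72.5, "kg"))])
extract = EHRExtract("patient-001", [
    Folder("cardiology episode", [
        Composition("clinic visit", [Section("examination", [vital_signs])]),
    ]),
])
names = [element.name for element in leaf_elements(extract)]
```

Every leaf value stays reachable together with its enclosing context (entry, section, composition), which is the point of the building blocks.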
10. In a generated medical summary
List of diagnoses and procedures
11. Clinical interpretation context
12. Examples of clinical interpretation context
- within the overall clinical story
- past, present
- intended treatments, planned procedures
- clinical circumstances of an observation
- e.g. standing, fasting
- presence / absence / certainty of the finding
- hypotheses, concerns
- a diagnosis for a relative
- but not the patient!
- confidence and evidence
- seniority of the author
- justification, clinical reasoning, guideline
references
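To see why this context matters when a summary such as a list of diagnoses is generated, consider the sketch below. The `Finding` structure and its field values are hypothetical, invented for illustration; they show only that a term on its own is not enough to decide whether it belongs on the patient's problem list.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    term: str
    subject: str = "patient"    # "patient" or a relative, e.g. "mother"
    status: str = "present"     # "present", "absent" or "suspected"
    temporal: str = "current"   # "past", "current" or "planned"


def confirmed_patient_problems(findings):
    """Keep only findings about the patient that are asserted as present."""
    return [
        f.term
        for f in findings
        if f.subject == "patient" and f.status == "present"
    ]


record = [
    Finding("asthma"),
    Finding("breast cancer", subject="mother"),          # family history
    Finding("diabetes mellitus", status="absent"),       # explicitly excluded
    Finding("pulmonary embolism", status="suspected"),   # hypothesis only
    Finding("appendicectomy", temporal="past"),
]

naive_terms = [f.term for f in record]          # ignores context entirely
problems = confirmed_patient_problems(record)   # respects context
```

The naive list would wrongly attribute a relative's diagnosis, an excluded diagnosis and an unconfirmed hypothesis to the patient.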
13. Examples of medico-legal context
- Authorship, responsibilities, signatories
- Dates and times
  - occurrence, clinical encounter, recording, schedules, intentions
- Information subjects
  - whose record is this? (who is the patient?)
  - about whom is this observation? (e.g. family history)
  - who provided this information
- Version management
- Access privileges
  - which need to be defined in ways that can be interpreted across organisational and national boundaries
- Consents
14. Clinical information standards
- Formally model clinical domain concepts
  - e.g. smoking history, discharge summary, fundoscopy
- Encapsulate evidence and professional consensus on how clinical data should be represented
  - published and shared within a clinical community, or globally
  - imported by vendors into EHR system data dictionaries
- Support consistent data capture, adherence to guidelines
- Enable use of longitudinal EHRs for individuals and populations
- Define a systematic EHR target for queries for decision support and for research
Archetypes (openEHR and ISO 13606-2)
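As a rough illustration of how a shared clinical model can support consistent data capture, the sketch below treats an archetype as a set of constraints and checks captured data against it. It is an assumption-laden toy: real archetypes are expressed in a dedicated formalism, and the `ARCHETYPE_SMOKING` dictionary, its field names and its value set are invented here, not taken from any published openEHR or ISO 13606-2 archetype.

```python
# Hypothetical, heavily simplified "archetype" for smoking history.
ARCHETYPE_SMOKING = {
    "status": {
        "type": "coded",
        "allowed": {"never", "former", "current"},
        "required": True,
    },
    "cigarettes_per_day": {
        "type": "quantity",
        "unit": "/d",
        "min": 0,
        "max": 200,
        "required": False,
    },
}


def validate(data, archetype):
    """Return a list of constraint violations (empty if the data conform)."""
    errors = []
    for name, rule in archetype.items():
        if name not in data:
            if rule["required"]:
                errors.append(f"{name}: missing")
            continue
        value = data[name]
        if rule["type"] == "coded":
            if value not in rule["allowed"]:
                errors.append(f"{name}: {value!r} not in value set")
        elif rule["type"] == "quantity":
            magnitude, unit = value
            if unit != rule["unit"]:
                errors.append(f"{name}: unit {unit!r} not allowed")
            elif not rule["min"] <= magnitude <= rule["max"]:
                errors.append(f"{name}: {magnitude} out of range")
    for name in data:
        if name not in archetype:
            errors.append(f"{name}: not defined in archetype")
    return errors


ok = validate({"status": "current", "cigarettes_per_day": (10, "/d")}, ARCHETYPE_SMOKING)
bad = validate({"status": "sometimes"}, ARCHETYPE_SMOKING)
```

If every system captures smoking history against the same constraints, a query written once against that model can be run across all of them.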
15. Example archetype for adverse reaction
16. openEHR Clinical Knowledge Manager
17. Using archetypes for querying EHR repositories
18. Example clinical questions
- Find the age and gender of patients who have been diagnosed with Hodgkin's disease, where the initial diagnosis occurred between the ages 50 and 70 inclusive
- What is the percentage of patients diagnosed with primary breast cancer in the age range 30 to 70 who were surgically treated and had post operative haematoma/seroma?
- What percentage of patients with primary breast cancer who relapsed had the relapse within 5 years of surgery?
- What is the average survival of patients with Chronic Myeloid Leukaemia (CML) and both with and without splenomegaly at diagnosis?
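The first question above can be sketched as a query over structured records. The record layout, the field names and the matching of diagnoses by literal term are all assumptions made for this illustration; a real repository would be queried through its archetype-based model and coded terms rather than free-text strings.

```python
from datetime import date


def age_on(birth_date, on_date):
    """Age in completed years on a given date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def hodgkin_cohort(patients, today, min_age=50, max_age=70):
    """Return (id, current age, gender) where the first Hodgkin's diagnosis
    occurred between min_age and max_age inclusive."""
    rows = []
    for p in patients:
        dates = [d["date"] for d in p["diagnoses"] if d["term"] == "Hodgkin's disease"]
        if not dates:
            continue
        age_at_first = age_on(p["birth_date"], min(dates))
        if min_age <= age_at_first <= max_age:
            rows.append((p["id"], age_on(p["birth_date"], today), p["gender"]))
    return rows


patients = [
    {"id": "p1", "gender": "F", "birth_date": date(1950, 6, 1),
     "diagnoses": [{"term": "Hodgkin's disease", "date": date(2005, 3, 1)}]},
    {"id": "p2", "gender": "M", "birth_date": date(1980, 1, 1),
     "diagnoses": [{"term": "Hodgkin's disease", "date": date(2010, 1, 1)}]},
    {"id": "p3", "gender": "M", "birth_date": date(1940, 1, 1),
     "diagnoses": [{"term": "Hodgkin's disease", "date": date(2010, 1, 1)},
                   {"term": "Hodgkin's disease", "date": date(2011, 5, 5)}]},
    {"id": "p4", "gender": "F", "birth_date": date(1955, 2, 2),
     "diagnoses": [{"term": "asthma", "date": date(2000, 1, 1)}]},
]
cohort = hodgkin_cohort(patients, today=date(2013, 1, 1))
```

Note that the answer depends on the initial diagnosis date, so the query needs the dates-and-times context of each entry, not just the presence of the term.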
19. Semantic interoperability
- New generation personalised medicine, underpinned by -omics sciences and translational research, needs to integrate data from multiple EHR systems with data from fundamental biomedical research, clinical and public health research and clinical trials
- Clinical data that are shared, exchanged and linked to new knowledge need to be formally represented to become machine processable.
- This is more than just adopting existing standards or profiles: it is mapping clinical content to a commonly understood meaning
- One can exchange completely meaningless information in a perfectly standardised message, hence the importance of content-related quality criteria (clinically meaningful) and of true semantic interoperability
20. EHR and knowledge integration
These areas need to be represented
consistently to deliver meaningful and safe
interoperability
21. Rich EHR interoperability
Consistent representation, access and interpretation
- EHR reference model, data types, near-patient device interoperability, archetypes, templates
- architecture, identifiers for people, policy models, structural roles, functional roles, purposes of use, care settings, pseudonymisation
- guidelines, care pathways, continuity of care
- clinical terminology systems, terminology sub-sets, value sets and micro-vocabularies, term selection constraints, post-co-ordination, terminology binding to archetypes, semantic context model, categorial structures
22. ARGOS semantic interoperability recommendations
- Nine strategic actions that now need to be championed, as a global mission
- 1. Establish good practice
- 2. Scale up semantic resource development
- 3. Support translations
- 4. Track key technologies
- 5. Align and harmonise standardisation efforts
- 6. Support education
- 7. Assure quality
- 8. Design for sustainability
- 9. Strengthen leadership and governance
23. Semantic interoperability resource priorities
- Widespread and dependable access to maintained collections of coherent and quality-assured semantic resources
  - clinical models, such as archetypes and templates
  - rules for decision making and monitoring
  - workflow logic
- which are
  - mapped to EHR interoperability standards
  - bound to well specified multi-lingual terminology value sets
  - indexed and correlated with each other via ontologies
  - referenced from modular (re-usable) care pathway components
- SemanticHealthNet will establish good practices in developing such resources
  - using practical exemplars in heart failure and coronary prevention
  - involving major global SDOs, industry and patients
24. Accelerating and leveraging knowledge discovery
- We need to accelerate the discovery of new knowledge from large populations of existing health records
- EHRs can provide population prevalence data and fine grained co-morbidity data to optimise a research protocol, and help identify candidates to recruit
  - almost half of all pharma Phase III trial delays are due to recruitment problems
25. Electronic Health Records for Clinical Research
- The IMI EHR4CR project runs over 4 years (2011-2014) with a budget of €16 million
  - 10 Pharmaceutical Companies (members of EFPIA)
  - 22 Public Partners (Academia, Hospitals and SMEs)
  - 5 Subcontractors
- One of the largest public-private partnerships
- Providing adaptable, reusable and scalable solutions (tools and services) for reusing data from EHR systems for Clinical Research
- EHRs offer significant opportunity for the advancement of medical research, the improvement of healthcare, and the enhancement of patient safety
26. The EHR4CR Scenarios
- Protocol feasibility
- Patient identification and recruitment
- Clinical trial execution
- Serious Adverse Event reporting
- across different therapeutic areas (oncology, inflammatory diseases, neuroscience, diabetes, cardiovascular diseases etc.)
- across several countries (under different legal frameworks)
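The protocol feasibility scenario can be pictured as counting, per site, how many patients satisfy a draft protocol's eligibility criteria. The sketch below is purely illustrative and is not the EHR4CR platform's design: the criteria, the record layout and the site names are invented, and it assumes each site evaluates the criteria locally and reports only an aggregate count.

```python
def count_eligible(patients, criteria):
    """Count patients who satisfy every eligibility criterion."""
    return sum(1 for p in patients if all(rule(p) for rule in criteria))


# Hypothetical eligibility criteria for a draft protocol.
criteria = [
    lambda p: 40 <= p["age"] <= 75,
    lambda p: "type 2 diabetes" in p["conditions"],
    lambda p: "heart failure" not in p["conditions"],
]

# Hypothetical site data; in this sketch the records never leave the site.
sites = {
    "site_a": [
        {"age": 55, "conditions": {"type 2 diabetes"}},
        {"age": 80, "conditions": {"type 2 diabetes"}},
        {"age": 60, "conditions": {"type 2 diabetes", "heart failure"}},
    ],
    "site_b": [
        {"age": 45, "conditions": {"type 2 diabetes"}},
        {"age": 50, "conditions": {"type 2 diabetes"}},
        {"age": 30, "conditions": set()},
    ],
}

feasibility = {site: count_eligible(records, criteria) for site, records in sites.items()}
```

Counts like these let a sponsor see whether tightening or relaxing a criterion changes the recruitable population before the protocol is finalised.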
27. EHR4CR will deliver
- Requirements specification
  - for EHR systems to support clinical research
  - for integrating information across hospitals and countries
- Innovative Business Model
  - for sustainability
  - to stimulate the marketplace
- Technical Platform (tools and services)
- Pilots for validating the solutions
  - different scenarios
  - different therapeutic areas
  - several countries
28. CHAPTER: Centre for Health service and Academic Partnership in Translational E-Health Research
Co-ordinator: Prof Harry Hemingway
29. TRANSLATIONAL CYCLE
CLINICAL RESEARCH PROGRAMMES
- Cardiovascular (UCLH BRC, QMUL BRU)
- Maternal & Child health (GOSH BRC)
- Infection (BRC, HPA)
- Neurodegeneration (UCLH, BRU)
- Eyes (Moorfields, BRC)
INFORMATICS CYCLE
CHAPTER
30. New UCLP Informatics Platform
Beneficiaries
CHAPTER portal interface to beneficiaries
Secure Data Warehouse in NHS Trusted Party
CHAPTER harmonizes consent, linkage, data sharing, anonymization, IG
NHS
CHAPTER
31. The IMI is a unique Public-Private Partnership (PPP) between the pharmaceutical industry, represented by the European Federation of Pharmaceutical Industries and Associations (EFPIA), and the European Union, represented by the European Commission
32. EMIF Project Vision
To enable and conduct novel research into human health by utilising human health data at an unprecedented scale
- Think Big
- Access to information on > 40 million patients
- AD research on 10-times more subjects than ADNI
- Metabolics research on > 20,000 obese T2DM subjects
- Linkage of clinical and omics data
- Development of a secure (privacy, legal) modular platform
- Continue to build a network of data sources and relevant research
33. Think Big
- Co-ordinator: Janssen
- Bart Vannieuwenhuyse
- 60 partners (3 consortia, EFPIA)
- 170 individuals involved
- 14 European countries represented
- €48 MM worth of resources (in-kind / in-cash)
- 3 projects in one
34. Project objectives
- EMIF: one project, three topics
- EMIF-Platform: Develop a framework for evaluating, enhancing and providing access to human health data across Europe, to support the two specific topics below as well as research using human health data in general
  - Lead: Prof. Johan van der Lei, Erasmus University Rotterdam
- EMIF-Metabolic: Identify predictors of metabolic complications in obesity, with the support of EMIF-Platform
  - Lead: Prof. Ulf Smith, University of Gothenburg
- EMIF-AD: Identify predictors of Alzheimer's Disease (AD) in the pre-clinical and prodromal phase, with the support of EMIF-Platform
  - Lead: Prof. Simon Lovestone, King's College London
35. EMIF platform for modular extension
EMIF governance
Metabolic
CNS
Research Topics
EMIF - Metabolic
EMIF - AD
Data Privacy
Analytical tools
EMIF - Platform
Semantic Integration
Information standards
Data access / mgmt
36. Key objectives EMIF-Metabolic
- A detailed understanding of the inter-individual variability in susceptibility to specific metabolic complications of obesity (i.e. diabetes, dyslipidemia, and liver steatosis and cancers) and the specific effects of the different constitutional, environmental and obesity-specific factors.
- The identification of novel susceptibility markers for metabolic complications of obesity (genetic, epigenetic and omics platforms)
- The identification and characterization of high-risk individuals for targeted interventions.
- The development of an algorithm leading to a diagnostic test that would predict high risk for the metabolic complications of obesity.
- The identification of novel targets or pathways for future therapeutic interventions.
37. Key objectives EMIF-AD
- Collection of data required for the development and validation of new biomarkers for AD
- Characterisation of study population and definition of extreme phenotypes
- Discovery of new biomarkers for the diagnosis and prognosis of predementia AD
- Validation of new biomarkers and development of strategies for selection of subjects in AD prevention trials
38. Key objectives EMIF-Platform
- Access to harmonised data
  - Access to harmonised patient medical information from different data sources across Europe
  - comprehensive health data comprising clinical, biomarker and other detailed health information on a number of populations and specific cohorts (pediatrics, adults, including vulnerable groups).
- Governance
  - Procedures and SOPs that govern access and utilisation of patient level data
  - Robust measures to enable linkage and sharing whilst preserving privacy
- Tools
  - Solutions in the areas of data privacy and ethics, standards and semantic interoperability
  - patient health data linkage and access to a combined patient health information base
- Business Model
  - That governs the use of the project output as well as the support for future research projects
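One common way to picture "linkage and sharing whilst preserving privacy" is keyed pseudonymisation: records from different sources are joined on a pseudonym derived from the patient identifier with a secret key, and the original identifier is dropped. This sketch is an assumption for illustration only and is not described in the slides as the EMIF approach; the identifiers, key handling and record layout are hypothetical, and keyed hashing on its own does not make data anonymous.

```python
import hashlib
import hmac


def pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier using HMAC-SHA256."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def link(source_a, source_b, secret_key):
    """Join two sources on pseudonym, dropping the original identifiers."""
    merged = {}
    for source in (source_a, source_b):
        for patient_id, data in source.items():
            merged.setdefault(pseudonym(patient_id, secret_key), {}).update(data)
    return merged


# Hypothetical key held by a trusted party, never by the researcher.
key = b"demo-key-held-by-trusted-party"
clinical = {"ID-123": {"bmi": 31}}
laboratory = {"ID-123": {"hba1c": 7.9}, "ID-456": {"hba1c": 5.4}}
linked = link(clinical, laboratory, key)
```

The same identifier always maps to the same pseudonym under one key, so records link correctly, while a different key yields unrelated pseudonyms.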
39. Researcher
Browsing through directory of data fingerprints
Controlled data access based on usage rights
(Private Remote Research Environments)
Common Data Model
Analytical tools / methods
40. Challenges with re-use of patient level data
41. Long-term view
Clinical Care
Clinical Research