PLUGIN-Rosetta (FHIR to OMOP in LinkML)
Data integration Healthcare
PLUGIN-Rosetta Case Study Infobox
- Author: Daniel Kapitan (@dkapitan)
- Last updated: 2026-04-17
- Mapping Type:
- Status of this case study:
Integration of information models, schemas and mappings in LinkML.
Short title
PLGN-ROSETTA
Summary
Data engineers tasked with building real-world data pipelines face disparate representations of information models (OMOP, FHIR, openEHR), vocabularies (SNOMED, LOINC) and mappings (OMOP vocabularies, Vulcan FHIR-to-OMOP), which they often want to integrate in their ETL code. Using LinkML as a common intermediate representation, PLUGIN-Rosetta aims to integrate the most commonly used clinical informatics models and mappings and make them easy and ready to use at design time.
Domain
Healthcare
Use case category
Integration (Connecting data across disparate resources)
Purpose of the mapping
This mapping is part of a large programme to implement an open, hybrid data sharing platform for Dutch hospitals. Please refer to the data station specification for more context.
More specifically, PLUGIN-Rosetta aims to implement (a first version of) the requirements defined in the Information section of the data station specification. The target audience/users of this project are data engineers who need to implement many kinds of ETL transformations and mappings for data stations.
Type of mapped resources
Schemas. Mappings between the most widely used healthcare informatics models: FHIR, OMOP, openEHR and also Dutch-specific models such as LBZ and the Dutch Cancer Registry.
Entities. Mappings between the most widely used vocabularies, such as OMOP Athena and UMLS.
Links to existing mappings
The project aims to integrate many existing, commonly used information models and mappings. The current proof-of-principle has integrated FHIR R4 to OMOP CDM 5.4 using the FHIR to OMOP IG. The roadmap includes:
- EOS: openEHR to OMOP
- DHD diagnosis thesaurus: mapping of Dutch diagnosis vocabulary to SNOMED. Expected to become publicly available in 2026.
- DHD procedure thesaurus: mapping of Dutch vocabulary of procedures to SNOMED. Expected to become publicly available in 2026.
- Dutch Z-Index to OMOP: mapping pharmaceutical vocabulary used in Dutch healthcare institutions to OMOP concepts, thereby linking it to SNOMED and RxNorm. Publication being considered by ErasmusMC.
- Epilepsy diagnosis: mapping of free text vocabulary for epilepsy domain to SNOMED. Under construction in collaboration with DHD and SEIN.
Tools used for creating the mapping
- Built as a standalone Python package that can be used in developing ETL pipelines.
- Uses the OpenCode agentic coding assistant and roborev to generate deterministic scripts.
Type of mapping relations
Integrates existing mapping relations, such as those defined by OMOP OHDSI Relation or in the LOINC Ontology.
The authors note that these mappings often lack precise mapping relation
definitions, such as skos:exactMatch, and are exploring how to improve
them through a combination of manual curation and agentic coding,
particularly for the Dutch vocabularies.
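To illustrate what "lacking a precise mapping relation" can mean in practice, the following is a minimal sketch, not the project's actual code. It assumes each mapping is held as a subject/object pair with an optional SKOS predicate; the `Mapping` class, the `needs_curation` helper and the `example:` identifiers are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# The five SKOS mapping properties a curated mapping could carry.
SKOS_PREDICATES = {
    "skos:exactMatch",
    "skos:closeMatch",
    "skos:broadMatch",
    "skos:narrowMatch",
    "skos:relatedMatch",
}


@dataclass(frozen=True)
class Mapping:
    """Hypothetical record for one source-to-target mapping."""
    subject_id: str                     # source term (placeholder CURIE)
    object_id: str                      # target term (placeholder CURIE)
    predicate_id: Optional[str] = None  # None means the relation is unspecified


def needs_curation(mapping: Mapping) -> bool:
    """True when the mapping has no recognised SKOS mapping predicate."""
    return mapping.predicate_id not in SKOS_PREDICATES


# Placeholder identifiers only; these are not real vocabulary codes.
mappings = [
    Mapping("example:source-1", "example:target-1", "skos:exactMatch"),
    Mapping("example:source-2", "example:target-2"),  # relation not stated
]

pending = [m for m in mappings if needs_curation(m)]
for m in pending:
    print(f"needs curation: {m.subject_id} -> {m.object_id}")
```

A filter like this would only produce the worklist; deciding which predicate actually applies remains the manual or agent-assisted curation step described above.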
Examples (samples) of different types of mapping implementations
A minimal LinkML representation of the FHIR R4 resource fields that are
mapped to OMOP CDM 5.4 by the HL7 fhir-omop-ig (tag 1.0.0-ballot). Each
mapped slot carries a list of exact_mappings pointing to OMOP CDM fields,
e.g.:
classes:
  Patient:
    class_uri: fhir:Patient
    attributes:
      gender:
        range: string
        slot_uri: fhir:Patient.gender
        description: male | female | other | unknown
        exact_mappings:
          - omop:gender_concept_id
          - omop:gender_source_value
      birthDate:
        range: date
        slot_uri: fhir:Patient.birthDate
        exact_mappings:
          - omop:birth_datetime
          - omop:year_of_birth
          - omop:month_of_birth
          - omop:day_of_birth
The full schema covers Patient, Encounter, Condition, Observation, Procedure, MedicationStatement, Immunization and AllergyIntolerance and is available in the linkml-rosetta repository.
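As a rough sketch of how such a schema might be consumed at design time, the snippet below collects the exact_mappings of every attribute into a lookup from FHIR slot to OMOP fields. It assumes the YAML has already been parsed into a plain dictionary by a YAML loader; the `schema` literal mirrors the Patient excerpt above, and `collect_exact_mappings` is a hypothetical helper, not part of the linkml-rosetta package.

```python
# In-memory form of the Patient excerpt, as a YAML loader would return it.
# Only the keys used below are reproduced.
schema = {
    "classes": {
        "Patient": {
            "class_uri": "fhir:Patient",
            "attributes": {
                "gender": {
                    "slot_uri": "fhir:Patient.gender",
                    "exact_mappings": [
                        "omop:gender_concept_id",
                        "omop:gender_source_value",
                    ],
                },
                "birthDate": {
                    "slot_uri": "fhir:Patient.birthDate",
                    "exact_mappings": [
                        "omop:birth_datetime",
                        "omop:year_of_birth",
                        "omop:month_of_birth",
                        "omop:day_of_birth",
                    ],
                },
            },
        }
    }
}


def collect_exact_mappings(schema: dict) -> dict[str, list[str]]:
    """Return {FHIR slot: [OMOP fields]} for every attribute that has mappings."""
    result: dict[str, list[str]] = {}
    for class_name, cls in schema.get("classes", {}).items():
        for attr_name, attr in (cls.get("attributes") or {}).items():
            targets = attr.get("exact_mappings") or []
            if not targets:
                continue  # skip unmapped attributes
            # Fall back to "Class.attribute" when no slot_uri is declared.
            key = attr.get("slot_uri", f"{class_name}.{attr_name}")
            result[key] = list(targets)
    return result


fhir_to_omop = collect_exact_mappings(schema)
for source, targets in fhir_to_omop.items():
    print(source, "->", ", ".join(targets))
```

Note that one FHIR slot can fan out to several OMOP fields (birthDate maps to four), so an ETL step using this lookup would still need per-field transformation logic.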