The SSSOM/RDF serialisation format
This section defines how to represent a SSSOM mapping set as an RDF model.
RDF formats
The RDF model that represents a SSSOM mapping set is independent of the concrete format that may be used to serialise the model.
It is RECOMMENDED that implementations support reading and writing a SSSOM set from and to the RDF Turtle format at least. They MAY support any other RDF concrete format (e.g. RDF/XML, TriG, N-Triples, etc.).
This specification does not mandate how a concrete RDF syntax is to be used. For example, if the RDF syntax allows named resources and predicates to be serialised as either IRIs or CURIEs, it is left to the discretion of the implementations (or their users) to decide which form to use.
Representation of slots
A metadata slot on any given SSSOM object (such as a Mapping or a MappingSet) MUST be represented as an RDF triple where:
- the subject is the resource representing the SSSOM object;
- the predicate is either:
  - the property indicated by the URI field in the LinkML description of the slot, if such a field is present;
  - or a property constructed by concatenating the https://w3id.org/sssom/ namespace and the name of the slot;
- the object is the value of the slot.
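The predicate-selection rule above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function name `slot_predicate` and its `slot_uri` parameter (standing for the URI field of the slot's LinkML description, assumed to be already expanded to a full IRI) are hypothetical.

```python
SSSOM_NS = "https://w3id.org/sssom/"


def slot_predicate(slot_name, slot_uri=None):
    """Return the IRI of the predicate used to represent a slot.

    slot_uri is the (hypothetical, already expanded) value of the URI field
    in the LinkML description of the slot, or None if there is no such field.
    """
    if slot_uri:
        return slot_uri
    # Fallback: the SSSOM namespace followed by the name of the slot.
    return SSSOM_NS + slot_name


print(slot_predicate("creator_id", "http://purl.org/dc/terms/creator"))
print(slot_predicate("subject_label"))
```

The two calls correspond to the dcterms:creator and sssom:subject_label predicates used in the examples of this section.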
Representation of slot values
The following rules determine how the value of a slot is represented as the object of an RDF triple.
For slots typed as sssom:EntityReference
(e.g. subject_id, mapping_justification, subject_source…)
The value MUST be represented as a named RDF resource (IRI).
For slots typed as sssom:NonRelativeURI
(e.g. license, mapping_provider, issue_tracker…)
The value MUST be represented as a named RDF resource (IRI).
For slots typed as linkml:date
(e.g. mapping_date, publication_date)
The value MUST be represented as an xsd:date literal.
For slots typed as linkml:double
(e.g. mapping_set_confidence, confidence, similarity_score)
The value MUST be represented as an xsd:double literal.
For slots typed as an enumeration
(e.g. sssom_version, mapping_cardinality, subject_type…)
If the permissible values for the enumeration are defined in the LinkML model as having an associated meaning property, then the value MUST be represented as the named RDF resource indicated by that property. Otherwise, the value MUST be represented as an xsd:string literal.
Examples:
A subject_type slot with the value owl class is represented by:
?object sssom:subject_type <http://www.w3.org/2002/07/owl#Class> .
while a mapping_cardinality slot with the value 1:1 is represented by:
?object sssom:mapping_cardinality "1:1"^^xsd:string .
because the owl class value of the EntityTypeEnum enumeration has a meaning property of http://www.w3.org/2002/07/owl#Class, while the 1:1 value of the MappingCardinalityEnum enumeration has no meaning property.
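The rules for scalar slot values given so far can be summarised by a small dispatch function. The following Python sketch is an illustration only: the name `to_turtle_term`, the `"enum"` type marker and the `meaning` parameter are assumptions made for the example, and the function expects identifiers to be already expanded to full IRIs and the xsd: prefix to be declared in the output file.

```python
def escape(text):
    """Escape characters that cannot appear raw in a short Turtle string."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def to_turtle_term(value, slot_type, meaning=None):
    """Render one slot value as a Turtle term.

    slot_type is the slot range as named in the text, or the hypothetical
    marker "enum" for any enumeration-typed slot. meaning is the full IRI
    given by the meaning property of the enum value, if there is one.
    """
    if slot_type in ("sssom:EntityReference", "sssom:NonRelativeURI"):
        return f"<{value}>"
    if slot_type == "linkml:date":
        return f'"{value}"^^xsd:date'
    if slot_type == "linkml:double":
        return f'"{value}"^^xsd:double'
    if slot_type == "enum":
        if meaning:
            # The enum value has a meaning: use it as a named resource.
            return f"<{meaning}>"
        return f'"{escape(value)}"^^xsd:string'
    raise ValueError(f"unsupported slot type: {slot_type}")


print(to_turtle_term("owl class", "enum", "http://www.w3.org/2002/07/owl#Class"))
print(to_turtle_term("1:1", "enum"))
print(to_turtle_term("2025-07-14", "linkml:date"))
```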
For slots typed as a SSSOM object
(e.g. mappings, extension_definitions)
The value MUST be represented as an RDF resource. Whether the resource is named (IRI) or not (blank node) depends on the type of the object; see the section on representing SSSOM objects below for details.
Representation of multi-valued slots
(e.g. creator_id, see_also, object_match_field…)
As an exception to the general principle that slots are represented by a single RDF triple, multi-valued slots MUST be represented by as many triples as there are values, each value being the object of one triple.
Non-normative notes:
- This means, in particular, that RDF complex structures intended to represent collections of values, such as rdfs:Container or rdf:List, MUST NOT be used to represent multi-valued SSSOM slots.
- This also implies that values in multi-valued slots are not ordered.
The other rules above apply to determine how each single value is to be represented.
Example:
A creator_id slot with the values https://example.org/people/0001 and https://example.org/people/0002 is represented by the following two triples:
?object dcterms:creator <https://example.org/people/0001> .
?object dcterms:creator <https://example.org/people/0002> .
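As an illustration of the one-triple-per-value rule, the Python sketch below emits the triples for a multi-valued slot whose values are IRIs. The function name `multi_valued_triples` and the blank node label `_:m1` are hypothetical, and the dcterms: prefix is assumed to be declared elsewhere in the output.

```python
def multi_valued_triples(subject, predicate, iris):
    """Return one Turtle triple per value; no list or container is built."""
    return [f"{subject} {predicate} <{iri}> ." for iri in iris]


creators = [
    "https://example.org/people/0001",
    "https://example.org/people/0002",
]
for line in multi_valued_triples("_:m1", "dcterms:creator", creators):
    print(line)
```

Since the values are not ordered, a reader would typically collect them into a set rather than rely on the order in which the triples appear.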
Representation of extension slots
An extension slot MUST be represented in a similar way to a standard slot, with the following specific rules.
The predicate is the property associated with the extension slot, as indicated by the property slot in the set’s definition of the extension.
The value of the extension MUST be represented:
- as a named RDF resource, if the type_hint of the extension definition is linkml:uriOrCurie;
- otherwise, as a literal of the type indicated by the type_hint.
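The two cases above can be sketched as follows in Python. The function name `extension_value_term` is hypothetical; the sketch assumes that the value is given in its lexical form, that IRI values are already expanded, and that the type_hint is a CURIE whose prefix is declared in the output file.

```python
def extension_value_term(value, type_hint):
    """Render the value of an extension slot as a Turtle term."""
    if type_hint == "linkml:uriOrCurie":
        # Identifier-like extensions become named resources.
        return f"<{value}>"
    # Anything else becomes a literal typed with the type_hint itself.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^{type_hint}'


print(extension_value_term("true", "xsd:boolean"))
print(extension_value_term("https://example.org/things/1", "linkml:uriOrCurie"))
```

In Turtle, the bare token `true` is shorthand for `"true"^^xsd:boolean`, so a writer may emit either form for the first call.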
Representation of SSSOM objects
Representation of a Mapping object
The RDF type of a Mapping object is owl:Axiom.
If the Mapping object has a record_id slot, then the value of that slot MUST be used as the named RDF resource that represents the object (and consequently, that slot MUST NOT be represented using the general rules for the representation of slots as defined above). Otherwise, the Mapping object is represented as a blank node.
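The choice between a named resource and a blank node for a Mapping object can be sketched as below. This Python fragment is an illustration under stated assumptions: records are plain dictionaries keyed by slot name, record_id values are full IRIs, and the `mapping_node` function and the `_:mappingN` label scheme are hypothetical.

```python
import itertools

# Source of unique labels for blank nodes (the label scheme is arbitrary).
_blank_ids = itertools.count(1)


def mapping_node(record):
    """Return (node, slots) for a mapping record.

    node is the Turtle term that stands for the Mapping object; slots holds
    the slots that remain to be serialised with the general rules.
    """
    slots = dict(record)
    # record_id, if present, becomes the node itself and is not
    # serialised as a triple of its own.
    record_id = slots.pop("record_id", None)
    if record_id:
        node = f"<{record_id}>"
    else:
        node = f"_:mapping{next(_blank_ids)}"
    return node, slots


node, slots = mapping_node({"record_id": "https://example.org/records/1",
                            "subject_label": "apple"})
print(node, sorted(slots))
```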
Representation of a MappingSet object
The RDF type of a MappingSet object is sssom:MappingSet.
A MappingSet object MUST be represented by a named RDF resource corresponding to the value of the mapping_set_id slot (which consequently MUST NOT be represented using the general rules for the representation of slots as defined above).
The curie_map slot MUST NOT be represented using the general rules. Instead, if it is needed, it MUST be represented using whatever mechanism is provided by the concrete RDF serialisation format (e.g. @prefix declarations in RDF Turtle or RDF TriG, or xmlns namespace declarations in RDF/XML).
Non-normative notes
- The CURIE map may not be needed at all if all named resources and predicates are always serialised as full-length IRIs.
- If at least some named resources or predicates are serialised as CURIEs, the RDF requirement that all used prefix names must be declared (using the appropriate mechanism for the chosen concrete syntax) takes precedence over the possibility of omitting the declarations of prefix names that are considered built-in in the context of SSSOM.
Representation of an ExtensionDefinition object
The RDF type of an ExtensionDefinition object is sssom:ExtensionDefinition.
An ExtensionDefinition object has no identifier of any kind and is always represented by a blank node.
Special considerations for serialising to RDF
When serialising a mapping set to SSSOM/RDF, implementations should consider how the resulting RDF file is intended to be used. In particular, they should ponder whether it is expected that the RDF serialisation can at any time be converted back to any other SSSOM format (e.g. SSSOM/TSV), or if it is only intended to be used by “generic”, non-SSSOM-aware RDF applications.
Depending on that intended usage (if it is known), implementations may adopt slightly different behaviours as described in the following subsections.
Serialisations of identifiers
If the serialisation is intended to be convertible back to another SSSOM format (especially the SSSOM/TSV format), implementations MUST declare all the prefixes found in the CURIE map and SHOULD serialise all identifiers as CURIEs using said declared prefixes.
Non-normative explanation
This is because, if all identifiers are serialised as full-length IRIs, then even if the RDF file includes prefix declarations, they may be stripped away by an RDF reader, since they are not needed. And without those prefix declarations, it would not be possible to serialise the set back as a SSSOM/TSV file (remember that the SSSOM/TSV format requires that identifiers be serialised as CURIEs).
Conversely, if the ability to convert the RDF file back to another SSSOM format is not required, implementations can freely decide whether to serialise identifiers as IRIs or CURIEs (assuming the concrete RDF syntax allows that of course).
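To make the round-trip case more concrete, the following Python sketch declares every prefix of a CURIE map in Turtle syntax and contracts IRIs to CURIEs using the longest matching namespace. The function names are hypothetical, and the sketch deliberately ignores one practical detail: a real writer would also have to check that the resulting local part is a valid Turtle prefixed name and fall back to the full IRI otherwise.

```python
def prefix_declarations(curie_map):
    """Return one Turtle @prefix line per entry of the CURIE map."""
    return [f"@prefix {prefix}: <{ns}> ." for prefix, ns in sorted(curie_map.items())]


def contract(iri, curie_map):
    """Return the CURIE for iri, or the bracketed full IRI if no prefix fits."""
    best = None
    for prefix, ns in curie_map.items():
        # Prefer the longest namespace that is a prefix of the IRI.
        if iri.startswith(ns) and (best is None or len(ns) > len(best[1])):
            best = (prefix, ns)
    if best is None:
        return f"<{iri}>"
    return f"{best[0]}:{iri[len(best[1]):]}"


curie_map = {
    "FOODON": "http://purl.obolibrary.org/obo/FOODON_",
    "ORCID": "https://orcid.org/",
}
print("\n".join(prefix_declarations(curie_map)))
print(contract("http://purl.obolibrary.org/obo/FOODON_00002473", curie_map))
```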
Extension definitions
Extension definitions MAY be omitted if the RDF file is only intended to be used by RDF applications.
Conversely, they SHOULD be included if the set is intended to be convertible back to another SSSOM format.
Non-normative explanation
The whole point of an extension definition in SSSOM is to provide (1) a property that confers some meaning to the extension, and (2) the type of the expected values. In RDF, as described above, those two bits of information are already contained in the triple that represents the extension slot, so there is no need for an additional definition.
But the extension definition also provides the slot_name which is used to represent the extension slot in other formats (especially SSSOM/TSV), so if conversion back to other SSSOM formats is required, ensuring that the extension definitions are present in the RDF serialisation is helpful.
Propagation and condensation
Propagatable slots can be represented in RDF in either their propagated or their condensed form, following the normal rules for propagation and condensation.
But if the RDF file is intended to be used by generic, non-SSSOM-aware RDF applications, then implementations SHOULD serialise propagatable slots in their propagated form.
Non-normative explanation
Propagation is a SSSOM-specific concept. If an RDF application is provided with an RDF file representing a set with condensed slots, the application will not know to propagate the condensed slots at the set level down to the level of the individual mappings, which will result in the application having an incomplete view of the mappings.
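The effect of propagating slots before serialisation can be sketched as follows. This Python fragment is a simplified illustration, not a restatement of the normative propagation rules, which are defined elsewhere in the SSSOM specification. In particular, the list of propagatable slots is passed in by the caller, and the sketch assumes that a slot is left untouched as soon as any mapping already carries its own value for it. The function name `propagate` is hypothetical.

```python
def propagate(set_slots, mappings, propagatable):
    """Copy set-level values of propagatable slots down to every mapping.

    Returns new (set_slots, mappings) structures; the inputs are not modified.
    """
    new_set = dict(set_slots)
    new_mappings = [dict(m) for m in mappings]
    for slot in propagatable:
        if slot not in new_set:
            continue
        if any(slot in m for m in new_mappings):
            # Assumed rule: do not propagate if a mapping has its own value.
            continue
        value = new_set.pop(slot)
        for m in new_mappings:
            m[slot] = value
    return new_set, new_mappings


set_slots = {"mapping_date": "2025-07-14", "license": "https://creativecommons.org/licenses/by/4.0/"}
mappings = [{"subject_label": "apple"}, {"subject_label": "gala"}]
print(propagate(set_slots, mappings, ["mapping_date"]))
```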
Representation of mappings as “direct triples”
For every single mapping record in a set, implementations MAY additionally inject a single triple of the form:
?subject_id ?predicate_id ?object_id .
If so, that behaviour MUST be optional.
When that behaviour is enabled, implementations SHOULD NOT inject such triples in the following cases:
- when the record represents a literal mapping (that is, subject_type or object_type – or both – is set to rdfs literal);
- when the record represents a negated mapping (that is, predicate_modifier is set to Not);
- when the record represents an absence of match (that is, subject_id or object_id – or both – is set to sssom:NoTermFound).
In any case, a SSSOM/RDF reader MUST NOT expect the presence of such triples, and if they are present MUST NOT use them to construct mapping records.
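The three exclusion cases for direct triples can be expressed as a simple guard on the writer side. The Python sketch below is an illustration only; it assumes that records are dictionaries keyed by slot name, that identifiers are already expanded to full IRIs, and that enum values are stored in their SSSOM/TSV form (`rdfs literal`, `Not`). The function name `direct_triple` is hypothetical.

```python
NO_TERM_FOUND = "https://w3id.org/sssom/NoTermFound"


def direct_triple(record):
    """Return the (subject, predicate, object) direct triple, or None."""
    # Literal mappings: the subject and/or object is not a semantic entity.
    if "rdfs literal" in (record.get("subject_type"), record.get("object_type")):
        return None
    # Negated mappings: a direct triple would assert the opposite meaning.
    if record.get("predicate_modifier") == "Not":
        return None
    # No-match records: there is no real mapping to express.
    if NO_TERM_FOUND in (record["subject_id"], record["object_id"]):
        return None
    return (record["subject_id"], record["predicate_id"], record["object_id"])


record = {
    "subject_id": "https://kewl-foodie.inc/food/F001",
    "predicate_id": "http://www.w3.org/2004/02/skos/core#exactMatch",
    "object_id": "http://purl.obolibrary.org/obo/FOODON_00002473",
}
print(direct_triple(record))
```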
Non-normative explanations
Such “direct triples” are merely a convenience for downstream RDF applications, allowing them to find a direct link (as a single triple) between the subject and the object of a mapping, without having to construct such a link by following the owl:annotatedSource, owl:annotatedProperty, and owl:annotatedTarget triples.
It is recommended not to inject such direct triples for literal mapping records, even if they do have a subject_id and an object_id, because by definition the subject and/or the object of such records is not an identifiable semantic entity and has no business being represented in an RDF graph.
It is recommended not to inject such direct triples for negated mapping records because they would seem to convey a meaning that is the exact opposite of what the records mean.
It is recommended not to inject such direct triples for no-match mapping records since they do not represent a real mapping.
Compatibility with pre-standard RDF representations
The present specification of the SSSOM/RDF format differs slightly from what several implementations of SSSOM have been producing before the format was formally specified.
In the name of backward compatibility, implementations MAY support the alternative rules described in the following subsections when deserialising from RDF.
Implementations MUST NOT follow these rules when serialising to RDF.
Representation of slots typed as sssom:NonRelativeURI
Implementations MAY accept a value represented as an xsd:anyURI literal.
For example, implementations MAY accept
?mapping sssom:mapping_provider "https://www.ohdsi.org/"^^xsd:anyURI .
as an alternative to
?mapping sssom:mapping_provider <https://www.ohdsi.org/> .
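A reader that opts into this compatibility rule might normalise both forms to the same value. The Python sketch below is an illustration only: it assumes a hypothetical, library-independent representation of parsed RDF terms as tuples, `("iri", value)` for a named resource and `("literal", lexical_form, datatype_iri)` for a typed literal. The function name `read_uri_slot` is also hypothetical.

```python
XSD_ANYURI = "http://www.w3.org/2001/XMLSchema#anyURI"


def read_uri_slot(term):
    """Return the URI carried by the object of a NonRelativeURI slot.

    Accepts the standard form (a named resource) and, for backward
    compatibility, an xsd:anyURI literal. Anything else is rejected.
    """
    if term[0] == "iri":
        return term[1]
    if term[0] == "literal" and term[2] == XSD_ANYURI:
        return term[1]
    raise ValueError(f"unexpected value for a URI-typed slot: {term!r}")


print(read_uri_slot(("iri", "https://www.ohdsi.org/")))
print(read_uri_slot(("literal", "https://www.ohdsi.org/", XSD_ANYURI)))
```

Since these alternative rules apply to deserialisation only, a writer built on the same data model would always emit the named-resource form.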
Representation of slots typed as an enumeration
Implementations MAY accept a value represented as a string literal, even if the value is defined in the LinkML model as having an associated meaning property.
For example, implementations MAY accept
?mapping sssom:predicate_modifier "Not"^^xsd:string .
as an alternative to
?mapping sssom:predicate_modifier sssom:NegatedPredicate .
Representation of a MappingSet object
Implementations MAY accept a MappingSet object represented as a blank node, with the mapping_set_id slot being represented as any other slot.
For example, instead of
<https://example.org/myset> a sssom:MappingSet .
implementations MAY accept
[] a sssom:MappingSet ;
sssom:mapping_set_id <https://example.org/myset> .
or even (by also applying the alternative rule regarding the representation of slots typed as sssom:NonRelativeURI)
[] a sssom:MappingSet ;
sssom:mapping_set_id "https://example.org/myset"^^xsd:anyURI .
Examples
This section is non-normative.
Considering the following set in the SSSOM/TSV format:
#curie_map:
# EXT: https://example.org/properties/
# FOODON: http://purl.obolibrary.org/obo/FOODON_
# KF_FOOD: https://kewl-foodie.inc/food/
# ORCID: https://orcid.org/
#mapping_set_id: https://example.org/sample-set
#mapping_set_description: Manually curated alignment of KEWL FOODIE INC internal food and nutrition database with Food Ontology (FOODON). Intended to be used for ontological analysis and grouping of KEWL FOODIE INC related data.
#license: https://creativecommons.org/licenses/by/4.0/
#mapping_date: 2025-07-14
#extension_definitions:
# - slot_name: ext_fooable
# property: EXT:isFooable
# type_hint: xsd:boolean
subject_id	subject_label	predicate_id	object_id	object_label	mapping_justification	author_id	confidence	ext_fooable
KF_FOOD:F001	apple	skos:exactMatch	FOODON:00002473	apple (whole)	semapv:ManualMappingCuration	ORCID:0000-0002-7356-1779	0.95	true
KF_FOOD:F002	gala	skos:exactMatch	FOODON:00003348	Gala apple (whole)	semapv:ManualMappingCuration	ORCID:0000-0002-7356-1779	1	false
A valid serialisation of that set in RDF/Turtle would be:
@prefix EXT: <https://example.org/properties/> .
@prefix FOODON: <http://purl.obolibrary.org/obo/FOODON_> .
@prefix KF_FOOD: <https://kewl-foodie.inc/food/> .
@prefix ORCID: <https://orcid.org/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix pav: <http://purl.org/pav/> .
@prefix semapv: <https://w3id.org/semapv/vocab/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix sssom: <https://w3id.org/sssom/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<https://example.org/sample-set> a sssom:MappingSet;
    dcterms:description "Manually curated alignment of KEWL FOODIE INC internal food and nutrition database with Food Ontology (FOODON). Intended to be used for ontological analysis and grouping of KEWL FOODIE INC related data.";
    dcterms:license <https://creativecommons.org/licenses/by/4.0/>;
    sssom:extension_definitions [
        sssom:property EXT:isFooable;
        sssom:slot_name "ext_fooable";
        sssom:type_hint xsd:boolean
    ];
    sssom:mappings [ a owl:Axiom;
        pav:authoredBy ORCID:0000-0002-7356-1779;
        dcterms:created "2025-07-14"^^xsd:date;
        owl:annotatedProperty skos:exactMatch;
        owl:annotatedSource KF_FOOD:F001;
        owl:annotatedTarget FOODON:00002473;
        EXT:isFooable true;
        sssom:confidence 9.5E-1;
        sssom:mapping_justification semapv:ManualMappingCuration;
        sssom:object_label "apple (whole)";
        sssom:subject_label "apple"
    ], [ a owl:Axiom;
        pav:authoredBy ORCID:0000-0002-7356-1779;
        dcterms:created "2025-07-14"^^xsd:date;
        owl:annotatedProperty skos:exactMatch;
        owl:annotatedSource KF_FOOD:F002;
        owl:annotatedTarget FOODON:00003348;
        EXT:isFooable false;
        sssom:confidence 1.0E0;
        sssom:mapping_justification semapv:ManualMappingCuration;
        sssom:object_label "Gala apple (whole)";
        sssom:subject_label "gala"
    ] .
Note that the two Mapping objects are represented as blank nodes, since the original set does not contain any record_id slot.
Note also that (1) identifiers are serialised as CURIEs whenever possible, and (2) the definition for the EXT:isFooable extension is included. This means that the set can be fully converted back to SSSOM/TSV without any loss of information.