An ORM MetaModel for RDF
The Atoms of Knowledge, Expanded
RDF (Resource Description Framework) can be serialized into various formats to suit different use cases. How often, though, do we focus on the metamodel of RDF? Or on a metamodel that spans databases, both graph and relational?
Here are the most common RDF output formats:
- RDF/XML: The original and standard XML-based serialization of RDF.
- Turtle (Terse RDF Triple Language): A human-readable, compact text format.
- N-Triples: A line-based, plain-text format for RDF triples, useful for streaming and debugging.
- N-Quads: Similar to N-Triples but supports RDF datasets with graph context.
- JSON-LD: A JSON-based format for RDF, designed for web developers and linked data.
- RDFa (RDF in Attributes): Embeds RDF data in HTML or XML documents using attributes.
- TriG: A Turtle-like syntax for RDF datasets, supporting multiple named graphs.
- Notation3 (N3): An advanced, compact format similar to Turtle but with additional expressiveness.
- HDT (Header-Dictionary-Triples): A binary format optimized for storage and retrieval.
- SPARQL Results XML/JSON: Formats for SPARQL query results; they carry query outputs rather than serializing RDF graphs.
“Each format has its strengths, depending on whether the focus is human readability, compactness, streaming, or integration with other technologies”
That’s the standard spiel; however, what we are after is utility at least cost.
The utility we are after is the ability to convey knowledge and meaning, where meaning is data in context.
If we take knowledge representation from first principles, we need to know and understand what I call The Atoms of Knowledge: Symbols and their representation within a medium, which I write about [here].
But when we speak about the transfer of knowledge, to imbue meaning, what we really need to know is that all the output formats of a methodology (e.g. the RDF output formats above) emanate from the same place: a common core.
That common core, when it comes to RDF variants, is the metamodel of RDF.
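To make the idea of a common core concrete, here is a minimal sketch in Python in which one in-memory list of triples is rendered into two of the formats listed above. The `Triple` type, the `example.org` namespace and the sample triple are all assumptions made for illustration; literal escaping, datatypes and language tags are deliberately left out, so this is not a full serializer.

```python
from typing import NamedTuple

EX = "http://example.org/"  # hypothetical namespace, for illustration only


class Triple(NamedTuple):
    subject: str                  # full IRI
    predicate: str                # full IRI
    obj: str                      # full IRI, or the text of a plain literal
    obj_is_literal: bool = False


def to_ntriples(triples):
    """Render triples as N-Triples: one statement per line, full IRIs."""
    lines = []
    for t in triples:
        obj = f'"{t.obj}"' if t.obj_is_literal else f"<{t.obj}>"
        lines.append(f"<{t.subject}> <{t.predicate}> {obj} .")
    return "\n".join(lines)


def to_turtle(triples, prefix="ex", namespace=EX):
    """Render the same triples as Turtle, abbreviating IRIs with one prefix."""
    def short(iri):
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
        return f"<{iri}>"

    lines = [f"@prefix {prefix}: <{namespace}> .", ""]
    for t in triples:
        obj = f'"{t.obj}"' if t.obj_is_literal else short(t.obj)
        lines.append(f"{short(t.subject)} {short(t.predicate)} {obj} .")
    return "\n".join(lines)


# One model, two serializations.
graph = [Triple(EX + "employee1", EX + "department", "Engineering", True)]
print(to_ntriples(graph))
print(to_turtle(graph))
```

The point of the sketch is only that the serializers differ while the list of triples they read from does not.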
RDF, Graph and Relational
Arguably, and as promoted, RDF describes data (via the implicit schema included in the RDF output) in a graph format, by way of what are known as Triples (subject, predicate and object). Triples form a directed, labelled graph (DLG).
E.g. “Alice is an employee in the Engineering department”
This maps to the triple: ex:employee/1 ex:department "Engineering"
Where:
- Subject = ex:employee/1 (representing Alice)
- Predicate = ex:department (the relationship “works in”)
- Object = “Engineering” (the department value)
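As a rough illustration of how triples form a directed, labelled graph, the Python sketch below stores each triple as an edge from subject to object, labelled with the predicate. The second triple (`ex:name`) is a hypothetical addition that is not part of the example above; it is there only to show two edges leaving the same node.

```python
from collections import defaultdict

# (subject, predicate, object) tuples; the identifiers are kept as plain strings.
triples = [
    ("ex:employee/1", "ex:department", "Engineering"),
    ("ex:employee/1", "ex:name", "Alice"),  # hypothetical extra triple
]


def build_graph(triples):
    """Group triples by subject: node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


dlg = build_graph(triples)
for label, target in dlg["ex:employee/1"]:
    print(f"ex:employee/1 --{label}--> {target}")
```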
However, relational data can be expressed as triples also. For instance, take the following table:
For relational data:
- Each row in a table can be represented as a set of RDF triples
- Column names become predicates
- Primary keys can become subject URIs
- Cell values become objects
- Foreign key relationships can be expressed as links between URIs
And that can be expressed in RDF Turtle format:
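A minimal sketch of the mapping rules listed above, written in Python. The `employee` row, the column names, the foreign key and the `example.org` base IRI are all assumptions made for illustration, not the table from the original figure. The IRI pattern (`table/key` for subjects, `table#column` for predicates) is one plausible convention among several.

```python
BASE = "http://example.org/"  # hypothetical base IRI


def row_to_triples(table, pk, row, foreign_keys=None):
    """Turn one table row (a dict) into Turtle statements.

    foreign_keys maps a column name to the table it references.
    """
    foreign_keys = foreign_keys or {}
    subject = f"<{BASE}{table}/{row[pk]}>"       # primary key -> subject IRI
    statements = []
    for column, value in row.items():
        if column == pk or value is None:        # skip the key itself and NULLs
            continue
        predicate = f"<{BASE}{table}#{column}>"  # column name -> predicate
        if column in foreign_keys:
            # foreign key -> link to the IRI of the referenced row
            obj = f"<{BASE}{foreign_keys[column]}/{value}>"
        elif isinstance(value, int) and not isinstance(value, bool):
            obj = str(value)                     # bare integer literal in Turtle
        else:
            obj = f'"{value}"'                   # plain string literal, no escaping
        statements.append(f"{subject} {predicate} {obj} .")
    return statements


employee_row = {"id": 1, "name": "Alice", "department_id": 10}
for line in row_to_triples("employee", "id", employee_row,
                           foreign_keys={"department_id": "department"}):
    print(line)
```

Each non-key column yields one triple, so a row with n non-null, non-key columns becomes n statements sharing the same subject.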
This is nothing new: the world is moving to multi-model databases by default, and the introduction of the ISO GQL standard (International Organization for Standardization: Graph Query Language) has stirred enthusiasm for graph queries over relational databases, as I write about [here].
The world is fast becoming aware that, at their core, relational and graph databases have a great deal in common. Graph-native databases may store data differently, but if we set those implementation details aside, conceptually we are talking about the same thing.
I often point to this gif, for instance, made with FactEngine’s Boston conceptual modelling software which stores graph and relational diagrams in the same metamodel:
All of the above is achieved using the Object-Role Modeling (ORM) metamodel. I.e. we can also morph to ORM diagrams, which visually come closer to a graph view of the world, with predicate-based associations by default:
The essence of this article is to propose that the ORM metamodel is suitable for generating all the various RDF output formats (and data/knowledge exchange by default).
Let us start by exploring Triples as they may be expressed in Object-Role Modeling.
Take the following Object-Role Model for example, which expresses that the concept of a Stocked Item in our Universe of Discourse is the relationship of a Part stored in a Bin in a Warehouse.
Normally you would say, “There are no triples in that diagram.”
However, the triples are implied and are stored in the ORM metamodel, as Link Fact Types (the dashed associations/triples with reverse reading predicates):
In ORM this can be shown more easily by representing StockedItem as its Objectifying Entity Type, with just the triples (as FactTypes) showing the associations to Part, Bin and Warehouse.
I.e. Object-Role Modeling is more expressive than standard RDF, because it can also express the ternary relationship, “Part is in Bin in Warehouse”, which is a StockedItem.
I.e. The metamodel of ORM is suitable for storing schema information of RDF.
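As a sketch of what storing the triples at schema level might look like, the Python below models the objectified ternary Fact Type and derives one binary Link Fact Type per role. The class names, fields and the placement of the predicate readings are assumptions made for illustration; this is not the actual structure of the ORM metamodel as implemented in Boston or any other tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    player: str        # object type playing the role, e.g. "Part"
    link_reading: str  # predicate read from the objectifying type to the player


@dataclass(frozen=True)
class FactType:
    name: str     # name of the objectifying entity type
    reading: str  # reading of the fact type itself
    roles: tuple  # one Role per object type taking part


def link_fact_types(fact_type):
    """Derive one (subject type, predicate, object type) triple per role."""
    return [(fact_type.name, role.link_reading, role.player)
            for role in fact_type.roles]


stocked_item = FactType(
    name="StockedItem",
    reading="Part is in Bin in Warehouse",
    roles=(
        Role("Part", "represents"),
        Role("Bin", "is in"),
        Role("Warehouse", "is in"),
    ),
)

for subject, predicate, obj in link_fact_types(stocked_item):
    print(subject, predicate, obj)
```

The ternary reading is kept alongside the derived binary associations, which is the extra expressiveness the text above refers to.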
But ORM can also store the data component of RDF, as follows, where the Part, 123, is stored in the Bin, ‘H1’, in the Warehouse called ‘Sydney’.
Data and Schema
It is said of RDF that it expresses both data and schema. I.e. one cannot exchange data that has meaning without also exchanging the schema that gives the data context.
We may do this also with Object-Role Modeling, with a language format such as FactEngine Knowledge Language:
Or we might express the implied triples:
StockedItem (456) represents Part (123)
StockedItem (456) is in Bin (‘H1’)
StockedItem (456) is in Warehouse (‘Sydney’)
…if we give the Objectifying Entity Type, StockedItem, a Reference Mode (in effect, a uniquely identifying surrogate ‘key’), which is another article in the making.
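The three implied triples above can be generated mechanically once the fact instance has a surrogate key. The Python below is a minimal sketch of that step. The function name and the way the readings are passed in are assumptions made for illustration, and the key 456 is simply the value used in the example above.

```python
def fact_to_triples(objectified_type, key, role_values):
    """Expand one objectified fact instance into binary statements.

    role_values is a list of (reading, object type, value) entries,
    one per role of the underlying fact type.
    """
    subject = f"{objectified_type} ({key})"
    # repr() leaves numbers bare and wraps strings in single quotes
    return [f"{subject} {reading} {player} ({value!r})"
            for reading, player, value in role_values]


statements = fact_to_triples(
    "StockedItem",
    456,
    [
        ("represents", "Part", 123),
        ("is in", "Bin", "H1"),
        ("is in", "Warehouse", "Sydney"),
    ],
)
for statement in statements:
    print(statement)
```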
Thank you for reading. As time permits, I will write more on how the metamodel of Object-Role Modeling is a suitable candidate for storing the data used to produce RDF in most, if not all, of its output variants.