Semantics of openEHR and ISO 13606 extracts
Versioning Semantics
Although in most clinical situations it is the latest versions of Compositions that are sent to a receiver, there are requirements for various amounts of version-related information to be included, as described in Requirements on page 8. At a minimum, Compositions always include the audit trail corresponding to the particular version which the Composition represents. In some cases, historical versions of a logical Composition are needed for medico-legal reasons. The receiver system may even need to reconstruct a complete facsimile of the versioned object, logically identical to its form at the source (but most likely stored in a different versioning implementation). The openEHR extract specification defines the simplest means of satisfying these needs, namely to include all Compositions in their whole form, including the case where they are successive versions of a single logical Composition such as "family history", as illustrated in the figure below. The main justification for this is that no assumptions should be made about the ability of sender or receiver systems to represent or efficiently process versions. Whole Compositions can always be processed by even the simplest systems.
It is assumed that any system that wants to be able to determine things such as who was responsible for changing a certain fragment of a Composition, when some part of a Composition came into being, or the differences between two particular versions of a Composition, must have version control capability locally. This usually means having some implementation of a version control model such as the one described in the openEHR Common Reference Model, which can do efficient versioning, differencing and so on. Supplying Compositions in their full form ensures that no assumption is made on what such an implementation might be.
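As a rough illustration of this point, a receiver that holds successive whole versions can derive differences locally, without the sender supplying any 'diff' structure. The sketch below is not openEHR API: it assumes each version has been flattened to a hypothetical mapping from node path to value, and the name `diff_versions` and the sample paths and values are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical flattened form of one Composition version: node path -> value.
FlatComposition = Dict[str, str]


def diff_versions(old: FlatComposition, new: FlatComposition) -> List[Tuple[str, str, str]]:
    """Return (change, path, detail) tuples describing how `new` differs from `old`."""
    changes = []
    for path in sorted(old.keys() | new.keys()):
        if path not in old:
            changes.append(("added", path, new[path]))
        elif path not in new:
            changes.append(("removed", path, old[path]))
        elif old[path] != new[path]:
            changes.append(("modified", path, f"{old[path]} -> {new[path]}"))
    return changes


# Two successive whole versions of a logical "family history" Composition
# (invented sample data, received in full form).
v1 = {"/content[1]/relative": "father", "/content[1]/problem": "diabetes"}
v2 = {
    "/content[1]/relative": "father",
    "/content[1]/problem": "type 2 diabetes",
    "/content[2]/relative": "mother",
}

for change in diff_versions(v1, v2):
    print(change)
```

A real implementation would compare structured nodes and audit details rather than flat strings; the point is only that the comparison happens on the receiver's side, using whatever versioning implementation it has.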
This approach is a departure from the ISO 13606-1:2008 EHR Extract standard, which defines Compositions so as to include revision history information on every node of the structure. Although the 13606 specification does not state whether the 'Composition' is supposed to be understood as a copy of a Composition from an EHR, or as a 'cumulative diff' of Composition versions in an EHR, analysis shows that only the latter can make sense, because the Composition is the unit of creation and modification, and there is logically only one audit trail for each version. Even the 100th version has associated with it only one audit trail.
This raises the question of whether a 'diff' form of Compositions should be used in the openEHR Extract, conforming to the ISO standard. This approach was not chosen for a number of reasons:
- it implies that senders can generate 'diff' information structures and that receivers can process them, i.e. it makes more assumptions than necessary about the sophistication of systems;
- the ISO specification appears to be in error with respect to deletions: the sending of logical deletions does not appear to be handled properly;
- the sending of deletions is not normally desired, and may be illegal (e.g. in Europe there are EC directives preventing the sending of statements corrected by clinicians or patients).
It is worth contemplating just how complex cumulative difference information would be. The following figure illustrates the structure generated by the accumulation of only three changes shown in the successive versions in the figure below. The large numbers of changes likely in persistent Compositions will generate far more complex structures.
In conclusion, while sending a difference form of Compositions is not out of the question in a future when EHR systems are routinely capable of sophisticated version handling, it is considered too complex currently, and the controls over sending deleted information have not been sufficiently well described.
Creation Semantics
The following describes an algorithm which guarantees the correct contents of an EHR extract. The inputs to the algorithm are:
- the list of EHR Compositions required in the extract (the "primary" Composition set);
- optionally a folder structure in which the Compositions are to be structured in the extract;
- the include_multimedia flag indicating whether DV_MULTIMEDIA content is to be included inline or not;
- the follow_links attribute indicating to what depth DV_LINK references emanating from Compositions should be followed and the Compositions containing the link targets also included in the extract.
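The inputs listed above could be gathered into a single request object, as in the following minimal sketch. The class names `ExtractRequest` and `FolderSpec`, the field names other than `include_multimedia` and `follow_links`, and the default values are all assumptions made for illustration; they are not taken from the openEHR specifications.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FolderSpec:
    """Hypothetical folder node: a name, the ids of Compositions filed in it, and child folders."""
    name: str
    composition_ids: List[str] = field(default_factory=list)
    children: List["FolderSpec"] = field(default_factory=list)


@dataclass
class ExtractRequest:
    """Inputs to the extract-creation algorithm (illustrative shape only)."""
    primary_composition_ids: List[str]      # the "primary" Composition set
    folders: Optional[FolderSpec] = None    # optional folder structure for the extract
    include_multimedia: bool = False        # include DV_MULTIMEDIA content inline? (default assumed)
    follow_links: int = 0                   # depth to which DV_LINK references are followed (default assumed)

    def __post_init__(self) -> None:
        if self.follow_links < 0:
            raise ValueError("follow_links must be zero or a positive depth")


request = ExtractRequest(
    primary_composition_ids=["comp-1", "comp-2"],
    include_multimedia=True,
    follow_links=1,
)
print(request)
```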
The algorithm is as follows.
- Create a new EHR_EXTRACT including the folder structure;
- Create a demographics EXTRACT_CHAPTER and write the PARTYs in;
- For each Composition in the original set, do:
  - create an X_VERSIONED_COMPOSITION, and set is_primary;
  - for each instance of OBJECT_REF encountered (e.g. PARTY_REF), obtain the target of the reference from the relevant service, and copy it to the appropriate chapter, e.g. demographics, access_groups tables with the key = the OBJECT_REF.id;
  - copy/serialise the Composition into the appropriate place in the folder structure, rewriting its OBJECT_REFs so that namespace = "local";
  - for each instance of DV_MULTIMEDIA encountered, include or exclude the content referred to by the uri or data attributes, according to the include_multimedia flag;
  - according to the value of follow_links, for each instance of DV_LINK encountered (only from/to Archetyped entities):
    - follow the links recursively. For each link: create an X_VERSIONED_COMPOSITION; set is_primary = False, write the path and write the target Compositions in the extract if not already there;
    - create the DV_LINK objects so that their paths refer correctly to the Compositions in the Extract;
TBD: do something about Access_control objects;
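The steps above can be sketched in a few lines over a much-simplified in-memory model. Everything in this sketch is an assumption made for illustration: the `Composition`, `ExtractEntry` and `Extract` classes are stand-ins rather than the openEHR reference model classes, the `parties` dictionary stands in for the demographic service, and the folder structure, DV_LINK path rewriting and access control objects are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Composition:
    """Hypothetical, much-simplified Composition used only for this sketch."""
    uid: str
    party_refs: List[Tuple[str, str]] = field(default_factory=list)  # (namespace, id) pairs
    multimedia: Dict[str, bytes] = field(default_factory=dict)       # uri -> content
    links: List[str] = field(default_factory=list)                   # uids of DV_LINK targets


@dataclass
class ExtractEntry:
    """Stand-in for an X_VERSIONED_COMPOSITION holding one whole Composition."""
    composition: Composition
    is_primary: bool


@dataclass
class Extract:
    demographics: Dict[str, str] = field(default_factory=dict)       # keyed by reference id
    entries: Dict[str, ExtractEntry] = field(default_factory=dict)   # keyed by Composition uid


def build_extract(primary_uids, ehr, parties, include_multimedia, follow_links):
    """Build an Extract from `ehr` (uid -> Composition) and `parties` (id -> record)."""
    extract = Extract()

    def add(uid, is_primary):
        source = ehr[uid]
        # Resolve each reference and copy its target into the demographics chapter.
        for _, ref_id in source.party_refs:
            extract.demographics[ref_id] = parties[ref_id]
        copied = Composition(
            uid=source.uid,
            # Rewrite references so that they point into the extract itself.
            party_refs=[("local", ref_id) for _, ref_id in source.party_refs],
            multimedia=dict(source.multimedia) if include_multimedia else {},
            links=list(source.links),
        )
        extract.entries[uid] = ExtractEntry(copied, is_primary)

    for uid in primary_uids:
        add(uid, is_primary=True)

    # Follow links breadth-first, one level per unit of follow_links depth.
    frontier = list(primary_uids)
    for _ in range(follow_links):
        next_frontier = []
        for uid in frontier:
            for target in ehr[uid].links:
                if target not in extract.entries:
                    add(target, is_primary=False)
                    next_frontier.append(target)
        frontier = next_frontier
    return extract


ehr = {
    "c1": Composition("c1", party_refs=[("demographics", "p1")],
                      multimedia={"img://1": b"\x00"}, links=["c2"]),
    "c2": Composition("c2", links=["c3"]),
    "c3": Composition("c3"),
}
parties = {"p1": "Dr. Example"}

extract = build_extract(["c1"], ehr, parties, include_multimedia=False, follow_links=1)
print(sorted(extract.entries), extract.entries["c2"].is_primary)
```

Adding all primary Compositions before following any links keeps a Composition that is both requested and linked-to marked as primary, and the breadth-first traversal ensures each link target is written only once.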