DLM Syntax

The openEHR Decision Language (DL) is primarily used to write Decision Logic Modules (DLMs). A DLM has the following structure.

dlm <form> <identifier>

definitions -- Descriptive

    |
    | lang and translations meta-data, as for archetypes
    |
    language = {
            original_language: [ISO_639-1::en],
            translations: {...}
        }
        ;

    |
    | description meta-data, as for archetypes
    |
    description = {...}
        ;

[use_model
    <reference model ids; defaults to openEHR Foundation & Base types>]

[use
    <local identifier>: <dlm identifier>
    <local identifier>: <dlm identifier>]

[preconditions
    <conditions that act as pre-conditions for this module>]

[definitions -- Reference
    <constant definitions>]

input -- Administrative State
    <administrative subject variables>

input -- Historical State
    <historical subject variables>

input -- Tracked State
    <real-time subject variable>

rules -- Conditions
    <conditions expressed as EL functions>

rules -- Main
    <computational rules expressed as EL functions>

rules -- Output
    <output rules expressed as EL functions>

definitions -- Terminology
    terminology = {...};

Many of the sections are optional and are only used in specific forms of DLM, indicated by the <form> on the first line.

The definitions — Descriptive and definitions — Terminology sections (which are logically identical data structures to the same-named sections in openEHR archetypes, but in a JSON5 superset form) can be omitted for DLMs in development for which no descriptive meta-data or terminology is needed.

Identifiers

All identifiers in a DLM are strings with no whitespace. If the terminology section is present, these identifiers are included with linguistic definitions and translations, in the same way as archetypes. This approach enables the symbolic variables such as is_type1_diabetic to be used in rules, and to have a full, formal, multi-lingual definition, such as the following:

definitions -- Terminology

    terminology = {
        term_definitions: {
            "en" : {
                "resting_heart_rate" : {
                    text: "heart rate at rest",
                    description: "..."
                }
                ...
                "SpO2": {
                    text: "Oxygen saturation";
                    description: "..."
                }
            }
        }
    }

Identification Section

The identification section takes the form dlm <form> <identifier>, where:

<form>: flavour of DLM artefact, with value ruleset | guideline;
<identifier>: identifier, in the format <concept>.vN.N.N where vN.N.N represents a numeric version of the logical form major.minor.patch, and whose values over time obey the Semver.org rules.

The following shows the identification line for a guideline:

dlm guideline chadvasc2.v1.0.5

A reference to a DLM may be effected by quoting the whole identifier, or a form with the minor and / or patch parts of the version missing. Either of these forms identify the most recent matching DLm available. For example, the reference NHS-chops14.v4 identifies whatever the most recent minor version of the v4 major version of the NHS-chops DLM is locally available.

Language and Description Sections

The language and description sections are the same as for openEHR archetypes, and are formally defined by the openEHR Resource specification. The following provides an example.

definitions -- Descriptive

    language = {
            original_language: [ISO_639-1::en]
        }
        ;

    description = {
            lifecycle_state: "unmanaged",
            original_author: {
                name:           "Thomas Beale",
                email:          "thomas.beale@openEHR.org",
                organisation:   "openEHR Foundation <http://www.openEHR.org>",
                date:           "2021-01-10"
            },
            details: {
                "en" : {
                    language: [ISO_639-1::en],
                    purpose: "To record an individual's QRISK3 score."
                }
            },
            copyright:  "© 2021 openEHR Foundation",
            licence:    "Creative Commons CC-BY <https://creativecommons.org/licenses/by/3.0/>",
            ip_acknowledgements: {
                "ClinRisk" : "This content developed from original publication of
                    © 2017 ClinRisk Ltd., see https://qrisk.org",
                "QRISK" : "QRISK® is a registered trademark of the University of Nottingham and EMIS"
            }
        }
        ;

Use_model Section

The optional use_model section enables a DLM to specify a model that defines the type system for the DLM. The identifier of the model must be resolvable to a BMM model, i.e. a model for which an openEHR Basic Meta-Model file is available. This might be a well-known model such as the openEHR Reference Model (BMM form), or a custom model created for local use. The model identifier must include a version matching part, with at least the major version present. The following example specifies the use of the openEHR RM in the latest 1.1 version available.

use_model
    openEHR-RM.v1.1

The types defined in all 'used' models become available to the DLM for use in declarations. If no model is specified, the openEHR Base and Foundation types are assumed.

Use Section

A DLM can use other DLMs, by declaring each DLM with an identifier in the use section. The identifier must include a version-matching part. The following example declares the identifier BSA as a convenient local label to refer to the latest version of the DLM Body_surface_area.v1.

use
    BSA: Body_surface_area.v1

Preconditions Section

The preconditions section is used to state logical conditions that must evaluate to true for the DLM to be used for the subject. Any Boolean-typed identifier may be used from the input, condition or rules sections, or any Boolean-returning logical expression referencing any identifiers declared in the DLM. A typical preconditions section is shown below:

preconditions
    is_pregnant

Reference Section

The reference section of a DLM contains what might be thought of constant definitions, i.e. identifiers declared with fixed values. The following illustrates.

definitions -- Reference

    paracetamol_dose: Quantity = 1g;
    chlorphenamine_dose: Quantity = 10mg;
    prednisolone_dose_per_m2: Quantity = 40mg;

These may be data structures of significant complexity, such as the following risk tables.

definitions -- Reference

    Snoking_interaction_scales = {
            [female]: {
                 [age_1]:   {
                    [non_smoker]:                   0,
                    [ex_smoker]:                   -4.70571617858518910,
                    [light_smoker]:                -2.74303834035733370,
                    [moderate_smoker]:             -0.866080888293921820,
                    [heavy_smoker]:                 0.902415623697106480
                },
                [age_2]:   {
                    [non_smoker]:                   0,
                    [etc]:                         -0.0755892446431930260000000
                }
            },
            [male]: {
                 [age_1]:   {
                    [non_smoker]:                   0,
                    [etc]:                         -0.2101113393351634600000000
                },
                [age_2]:   {
                    [non_smoker]:                   0,
                    [etc]:                         -0.0004985487027532612100000
                }
            }
        }
        ;

Input Section

Subject Variables

The input section contains declarations of all subject variables used by a DLM. At a minimum, a subject variable declaration states the symbolic name and type of the variable in the manner typical of a typed programming language, as exemplified by the following:

input
    heart_rate: Quantity

Although a subject variable declaration appears to declare a simple property of a type such as a Quantity, in fact it creates an instance of a proxy object described below in [_dlm_model], that provides access to snapshot values of the variable over time, as well as other smart facilities including null-value detection and range conversion, described in the sections below.

Variable Naming

The naming of a subject variable is important, and should reflect its intended domain meaning with respect to the guideline or plan which it formalises. Thus, a cardiology guideline might use a variable systolic_bp to mean 'current instantaneous systolic blood pressure' and a variable target_systolic_bp to mean a target pressure for the patient to aim for over the course of hypertension treatment. However a guideline that refers to different systolic blood pressures, e.g. historical, average and current might use variables such as actual_systolic_bp, 24h_average_systolic_bp etc.

The naming is important in another way. Generally a subject variable should reflect a fact or assertion about the subject in reality rather than a purely epistemic view relating to an information system. For example a variable is_type1_diabetic is intended to reflect the patient’s real diabetic status, not just the knowledge of the local hospital EMR system of whether the patient is diabetic. Such variables may be termed 'ontic' i.e. reflecting the real world, rather than reflecting states of knowledge of some information source. The reason for using ontic variables is to allow DLM authors to define rules in terms of true clinical reality based on reliable previously established facts, rather than continually having to compensate for missing or unreliable knowledge within a guideline.

Epistemic variables may of course be defined, e.g. the variable has_diabetes_diagnosis directly reflects the idea that the presence of a diagnosis of a condition is distinct from the true fact of having the condition. These are typically used when the purpose of the guideline is to establish the presence or otherwise of the condition named in such variables.

Unavailable Values

One of the facilities created by a declaration of the form identifier: Type are subordinate predicate functions to detect if a value is available for the variable, i.e. if it is not logically null. Lack of a value is caused either by the true absence of the data in back-end systems (e.g. blood pressure has not recently been measured) or a technical failure to query or otherwise interrogate the relevant system.

It should be noted that within the overall conceptual model of process-based computing in openEHR, the common problem of a failure to locate a data item in back-end systems causing a live user to be asked to supply it is assumed to be addressed outside the DLM itself. This means that the simple lack of a value in back-end systems does not need to be compensated for by logic (such as null checks) in a DLM itself - it will already have been done in the Subject Proxy service. Consequently, if a variable value is unavailable within a DLM computation, this already takes into account attempts to obtain a value from a user.

The general case is that any subject variable might not have a value available for it, or at least a sufficiently current value (see next section for the concept of currency) at the moment of a particular rule invocation (remembering that the same rule might be invoked repeatedly over time). This means that the simple (primary variable) reference systolic_blood_pressure may in fact return a null value. If rules containing primary variable references such as systolic_blood_pressure are written under a non-null assumption, a null value will cause an exception of type no available value, and the original rule invocation will fail.

In most cases this is likely to be the preferred style of rule expression, since it makes rules simpler and clearer. However, in some cases, it may be known a priori that certain variables are only sometimes likely to be available, and if so, they are used, but if not, no exception is generated. This may be achieved by calling the subordinate predicate is_available as a guard on the direct access, as follows.

rules

    is_hypertensive:
        Result := systolic_blood_pressure.is_available and then systolic_blood_pressure.in_range([high]) or  ...

In the above, the semi-strict Boolean operator and then ensures the second reference to systolic_blood_pressure will only be evaluated if systolic_blood_pressure.is_available returns True.

Tracked Variables

An important category of subject variable is the tracked variable, which enables the current value of a variable, with some lag, to be obtained automatically from the back-end system. The acceptable lag is indicated by the subordinate currency attribute, as a temporal duration. This specifies how recent the value obtained from the external world (the 'sample') must be to be valid from the point of view of the DLM. Currency may be understood as the converse of 'staleness', that is, a variable sample that must be say 1 hour or less old is understood as stale after 1 hour.

The use of the currency modifier establishes that a subject variable is a time-related sample of some kind (instantaneous, average, minimum, etc) of a real-world time-varying continuant quality (e.g. blood pressure) of an independent continuant entity (usually a person).

Since the various physiological and disease process that occur in a human body have significantly differing temporal rhythms, currency will vary widely for different subject variables, as per the following examples.

input -- Administrative State

    |
    | untracked variable:
    | DOB never changes, no currency needed
    |
    date_of_birth: Date
        ;

input -- Historical State

    |
    | tracked variable:
    | weight changes over a period of days
    |
    weight: Quantity
        currency = 3 days
        ;

    |
    | untracked variable:
    | assuming an adult subject, height constant
    |
    height: Quantity
        ;

input -- Tracked State

    |
    | tracked variable:
    | blood glucose changes within minutes in response to food
    |
    blood_glucose: Quantity
        currency = 15 min
        ;

    |
    | tracked variable:
    | Heart-rate may change quickly
    |
    heart_rate: Quantity
        currency = 5 sec
        ;

Variables for which no currency is stated may be understood as having the currency equal to the age of the subject.

Quantitative Tracked Variables

Many tracked variables are quantitative, and a ubiquitous need within clinical guidelines and rules is to be able to refer to a continuous variable such as vital signs and most lab test values as being in a designated range. Such ranges may be the usual ones published e.g. the normal and high ranges for lipids in a cholesterol test for adults, or ranges defined by the DLM.

TBD: could ranges be declared elsewhere and re-used, e.g. standard BMI ranges, BP ranges etc?

The ranges for a subject variable are declared in the following way:

input -- Tracked State

    |
    | Systolic BP (mmHg)
    |
    systolic_blood_pressure: Real
        currency = 1 min
        ranges["mmHg"] =
            ------------------------------------
            |≥ 180|:            [critical_high],
            |> 140|:            [very_high],
            |> 120|:            [high],
            |> 90 .. ≤ 120|:    [normal],
            |≤ 90|:             [low],
            |≤ 50|:             [critical_low]
            ------------------------------------
        ;

The above declaration allows the use of the predefined range function, which returns the most precise range in which the value falls, in rule expressions such as the following:

    Result :=
        case systolic_blood_pressure.range in
            =====================================
            [critical_high]:         [emergency],
            -------------------------------------
            [high], [very_high]:     [high_risk],
            -------------------------------------
            *:                       [monitor]
            =====================================
        ;

Note that ranges may be defined in two ways:

to be overlapping, such that [high] refers to any value higher than 120, while [very_high] refers to any value over 140;
non-overlapping, such that each sub-range stops at the start of the next one.

In the latter case, determining that a patient has high blood pressure requires an expression of the form systolic_blood_pressure.in_range ({[high], [very_high], [critical_high]}), using a set argument.

Sometimes there are multiple ranges, usually due to alternative units systems. This is handled by the use of a discriminator on a number of range groups. The following shows an example.

input -- Tracked State

    PaO2_FiO2_ratio: Quantity
        currency = 1 min,
        ranges["mm[Hg]"] =
            ---------------------------------
            |≥400 |:        [normal],
            |300 .. 399|:   [low],
            |200 .. 299|:   [very_low],
            |100 .. 199|:   [extremely_low],
            |<100|:         [critical_low]
            ---------------------------------
        ;,
        ranges["kPa"] =
            ---------------------------------
            |≥53|:          [normal],
            |39.9 .. 53|:   [low],
            |26.6 .. 39.8|: [very_low],
            |13.3 .. 26.5|: [extremely_low],
            |<13.3|:        [critical_low]
            ---------------------------------
        ;

To access a range within this structure, an expression of the form systolic_blood_pressure.in_range (["kPa"], [high]) is needed. This makes use of an overload of the in_range() function taking two keys. We can now see that the simpler form systolic_blood_pressure.in_range ([critical_high]) first chooses the first range group, and then extracts the required entry.

The above case reveals the general structure of the ranges property, with a formal type equivalent to Map <Terminology_code, Map <Interval <T>, Terminology_code>, where T is the declared type of the subject variable. In both the simple and discriminator cases, the entry |> 140|: [very_high] defines a [key, value] pair whose key is of type Interval <Real> and whose value is of type Terminology_code.

Rules Section

The DLM rules section is the section of primary importance, since it contains the rules for which a DLM is created. DLM rules are formally expressed as functions in the openEHR Expression Language, based on the openEHR BMM.

This section may be typically divided into two or more groups for authoring convenience. The first group may be used for simple Boolean-returning 0-order functions that represent 'named conditions', for use in the primary rules. The Boolean type may be omitted, since all conditions have this as their formal type. The following is an example.

rules -- Conditions

    her2_positive:
        Result := her2_expression = [positive]
        ;

    non_class_I_heart_failure:
        Result := has_heart_failure_class_II or
                    has_heart_failure_class_III or
                    has_heart_failure_class_IV
        ;

    anthracyclines_contraindicated:
        Result := has_transmural_MI or
            ejection_fraction.in_range ([low]) or
            non_class_I_heart_failure
        ;

The primary rules may be included in a separate rules group, consisting of 0-order functions returning any type. EL structures of any complexity may be used. The following provides an example.

rules -- Main

    hypertension_risk: Terminology_term
        Result :=
            choice in
                =================================================
                has_pre_eclampsia or
                has_eclampsia:                      [emergency],
                -------------------------------------------------
                previous_obstetric_hypertension or
                previous_pre_eclampsia or
                previous_eclampsia or
                has_pregnancy_hypertension:         [high_risk],
                -------------------------------------------------
                *:                                  [low_risk]
                =================================================
            ;

    gestational_diabetes_risk: Boolean
        Result :=
            bmi.in_range ([high]) or
            previous_macrosomic_baby or
            previous_gestational_diabetes or
            family_history_of_diabetes or
            race_related_diabetes_risk
        ;

Output Section

TBD:

Terminology Section

The definitions — Terminology section of a DLM serves the same purpose as the terminology section in an openEHR archetype, which is to provide multi-lingual definitions of all codes used in the artefact. Unlike archetypes, the codes in a DLM may be freely named, since they act as names of all symbolic entities referenced elsewhere in the DLM, including rules (i.e. functions), subject variables and constants. A typical DLM terminology section is shown below.

definitions -- Terminology

    terminology = {
        term_definitions: {
            "en" : {
                "date_of_birth" : {
                    text: "Date of birth"
                },
                "age_in_years" : {
                    text: "Age (years)"
                },
                "qRisk_score" : {
                    text: "QRISK2 score"
                },
                "diabetes_status" : {
                    text: "Diabetic status of subject"
                },
                "no_diabetes" : {
                    text: "Non-diabetic"
                },
                "type1_diabetes" : {
                    text: "Has type 1 diabetes"
                },
                "type2_diabetes" : {
                    text: "Has type 2 diabetes"
                }
            }
        }

        value_sets: {
            "diabetes_status" : {
                id: "diabetes_status",
                members: ["no_diabetes", "type1_diabetes", "type2_diabetes"]
            }
        }
    }
    ;