Our legal system is based on documents written in natural language (English, French, Japanese, German, etc.). For agreements/contracts to be legally binding it is therefore imperative that the structured data within agreements is embedded within arbitrary natural language.
Note that for the purposes of this document we use the terms agreement, contract and document interchangeably. They all represent a human/legal natural language context for a set of variable values.
- To promote a vendor and user ecosystem centered around agreements and templates
- To facilitate a standard REST API for agreements and templates
- To promote a vendor ecosystem of agreement and template management and editing tools
- To future-proof templates and allow templates to be migrated between vendors
- To provide a "pivot" data format that can be imported from existing document formats (HTML/OOXML/ODF/PDF etc.) or converted to existing document formats (HTML/OOXML/ODF/PDF etc.)
Note that this document describes TWO (related) JSON data formats (data models):
- Template Format (aka TemplateMark)
- Agreement Format (an agreement is an instance of a template) (aka AgreementMark)
Given that agreements and agreement templates are intended for both human and machine consumption, it is useful to be able to represent agreements using a human readable/editable/diffable format (e.g. extended markdown) as well as a machine readable format (e.g. JSON), with an isomorphic transformation to move between these.
First, let’s start by defining the semantic elements of an agreement.
Note: both AgreementMark and TemplateMark are specified as Concerto data models and use Concerto JSON serialization.
The root node of an agreement is the Document node. A document node contains a set of child nodes.
At the most basic level agreements are expressed as plain text: sequences of Unicode characters, along with markers to escape control characters.
Text:
This is plain text.
JSON AST:
{
  "$class": "org.accordproject.commonmark.Document",
  "nodes": [
    {
      "$class": "org.accordproject.commonmark.Paragraph",
      "nodes": [
        {
          "$class": "org.accordproject.commonmark.Text",
          "text": "This is plain text."
        }
      ]
    }
  ]
}
The AST above defines a document containing a single paragraph, which contains a simple plain text string "This is plain text.".
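The relationship between the AST and its text can be illustrated with a minimal sketch that walks the tree and concatenates every text node. The helper name extractText is hypothetical and is not part of any Accord Project library; it only assumes that container nodes carry a "nodes" array and text nodes carry a "text" string, as in the example above.

```javascript
// Hypothetical helper: recursively collect the text held in an AST node.
// Assumes text nodes have a string "text" and containers have a "nodes" array.
function extractText(node) {
  if (typeof node.text === 'string') {
    return node.text;
  }
  return (node.nodes || []).map(extractText).join('');
}

const plainTextDocument = {
  $class: 'org.accordproject.commonmark.Document',
  nodes: [
    {
      $class: 'org.accordproject.commonmark.Paragraph',
      nodes: [
        { $class: 'org.accordproject.commonmark.Text', text: 'This is plain text.' },
      ],
    },
  ],
};

console.log(extractText(plainTextDocument)); // prints "This is plain text."
```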
TBD: need to update these examples to use versioned namespaces.
Agreements contain embedded variable values (aka “deal points”). It is useful to distinguish variable values from plain text so that variable values can be highlighted, or displayed in summary views.
In the example below, the variable values have been highlighted in bold:
Basic Compensation. (A) SALARY. The Executive shall be paid an annual salary of $230,000.00, subject to an adjustment as provided below (the "Salary"), which will be payable in equal periodic installments according to the Employer's customary payroll practices, but not less frequently than monthly. The Salary will be reviewed by the Board of Directors not less frequently than annually and may be adjusted upward or downward in the sole discretion of the Board of Directors.
Agreements may contain values that are calculated from variable values (the results of applying a formula to variable values). In the example below the variable values are highlighted in bold, and the formula value is highlighted in italic.
Fixed rate loan This is a fixed interest loan to the amount of £100,000.00 at the yearly interest rate of 2.5% with a loan term of 15 years, and monthly payments of £667.00
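As an illustration of a formula value, the sketch below recomputes the monthly payment from the three variable values in the loan example. It assumes the standard annuity (amortization) formula; the source does not state which formula produced the figure, and the function and property names are hypothetical.

```javascript
// Assumption: the standard annuity formula is used for the monthly payment.
// payment = P * r / (1 - (1 + r)^-n), with r the monthly rate and n the number of payments.
function monthlyPayment(principal, annualRatePercent, termYears) {
  const monthlyRate = annualRatePercent / 100 / 12;
  const numberOfPayments = termYears * 12;
  if (monthlyRate === 0) {
    return principal / numberOfPayments; // interest-free loan
  }
  return (principal * monthlyRate) / (1 - Math.pow(1 + monthlyRate, -numberOfPayments));
}

// Variable values taken from the example text (property names are illustrative).
const loanTerms = { amount: 100000, rate: 2.5, loanTerm: 15 };
const computedPayment = monthlyPayment(loanTerms.amount, loanTerms.rate, loanTerms.loanTerm);

console.log(Math.round(computedPayment)); // prints 667
```

Under this assumption the formula value agrees with the £667.00 shown in the example once rounded to the nearest pound.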
Agreements do not typically require the full layout and formatting features of a word processor; however, control of the presentation of the text may be required or desirable for some applications. This may include:
- Bold (strong)
- Italic (emphasis)
- Underline? (BUT read this, and this for a discussion on semantics vs presentation!)
- Superscript?
- Subscript?
- Strikethrough
In addition the font rendering for elements of the agreement should be specifiable via a theme or stylesheet, including setting:
- Font Face
- Font Color
- Font Size
It is not typically necessary (or desirable) to be able to override font choice within the document itself; instead this should be defined via a CSS-like mechanism that binds semantic elements to display properties. For example, all H1s should be displayed in a serif font, size 32, bold, pink.
Agreements typically make extensive use of headings to organize content. Headings are hierarchical: H1, H2, H3, H4, H5, H6 (from markdown) being very common.
A clause is an identified part of an agreement and can be considered a container for a set of paragraphs.
TBD. Is this the same as a "section"?
Some pages may be tagged as an appendix.
TBD. More details required.
Agreements usually require page numbers in footer/header and may sometimes require supplemental information, such as confidentiality, status, file name, logo etc, though this is somewhat controversial.
TBD how to model this.
Text is typically organized into a set of sequential paragraphs, separated by whitespace.
Whitespace is not typically semantic; instead, whitespace is inserted via explicit line breaks, paragraphs, or page breaks (“thematic breaks”).
Agreements often include images, pictures or diagrams to illustrate products, instructions or business processes. Images should be referenced within the text of the agreement.
Note: There are security (CORS) issues with referencing external images from web apps.
See: https://spec.commonmark.org/0.30/#images
Agreements frequently contain hyperlinks to external content.
See: https://spec.commonmark.org/0.30/#links
Agreements often include explicit page breaks to force content onto a new page, to improve readability or organization. How “pages” are rendered on a mobile device or in a responsive web-browser will vary based on UX considerations. Note that the CommonMark specification therefore refers to these as “thematic breaks”, rather than “page breaks” — because the intent is to insert a switch of theme, rather than necessarily control a printer head and the motors controlling paper advance.
Most agreements include lists in some form, and some contracts are purely composed of lists and nested lists, ensuring that every element is identified via a unique list path, such as element “1.3.4”, for example.
List numbering may be specified as roman, numeric, alphanumeric or unordered (bullets).
Many agreements include tables, used to display sets of related data in compact form, for example product details, contact details, pricing tables, discount tables, or compensation amounts.
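The idea of a unique list path can be sketched as a traversal that assigns each nested item a dotted, numeric path. The item shape (text plus optional children) and the function name assignListPaths are assumptions made for this illustration only.

```javascript
// Hypothetical nested-list shape: { text, children? }.
// Returns a flat array of { path, text }, where path is e.g. "1.3.2".
function assignListPaths(items, prefix = '') {
  const result = [];
  items.forEach((item, index) => {
    const path = prefix === '' ? String(index + 1) : `${prefix}.${index + 1}`;
    result.push({ path, text: item.text });
    result.push(...assignListPaths(item.children || [], path));
  });
  return result;
}

const nestedClauses = [
  {
    text: 'Definitions',
    children: [
      { text: 'Parties' },
      { text: 'Goods' },
      { text: 'Pricing', children: [{ text: 'Base price' }, { text: 'Discounts' }] },
    ],
  },
  { text: 'Term' },
];

const numberedClauses = assignListPaths(nestedClauses);
numberedClauses.forEach(({ path, text }) => console.log(`${path} ${text}`));
```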
Agreement text often includes references to semantic elements from within the same agreement, or to other agreements, contracts or documents. Examples:
- As defined in Warranty Clause
- See pricing defined in section 2.3.5
- Defined in Appendix 3
- As specified in image 4.
See https://spec.commonmark.org/0.30/#link-reference-definitions
It is useful to be able to wrap semantic elements in an annotation, to indicate for example that a sentence represents high risk, or that a paragraph appears to be an instance of a type of clause. In many cases these annotations would be the result of running a machine-learning algorithm over the source text of the agreement.
Other types of annotations could include comments from human reviewers.
Now, let us turn to the format for an agreement template. Agreement templates are a superset of agreements, in that a (degenerate) agreement template is simply a hardcoded agreement, absent all variables.
There are many commercial products for producing agreements from templates (see Additional Resources for a selection). Most are broadly similar in terms of features and semantics; however, there is no universally adopted system for exchanging templates. This has resulted in a fragmented ecosystem, vendor lock-in, and no way to future-proof the (considerable) investment required to build templates.
A template is expressed as natural-language with embedded variables and template expressions. Template variables are named and rely on an associated Concerto model (type-system).
For example, the trivial template:
Hello {{firstName}}!
Has a single named variable firstName and an associated template model that specifies that firstName is of type String:
concept TemplateModel {
  o String firstName
}
The template model is critical in ensuring that variable values conform to the model, as well as providing the basis for downstream processing of agreement data, charts, reporting and a guided user-experience when supplying variable values.
Note: This example syntax/semantics for templates is heavily inspired by Handlebars, so you may want to review that.
An explicit goal of the agreement template format is that we should be able to statically validate (potentially compile) the logic of a template, ensuring that all variable references and formula usages are valid and that if the template is presented with valid data it is guaranteed to produce valid output. This contrasts with many current template technologies, which fail in unpredictable ways due to bugs in the template syntax/expressions or invalid assumptions about the shape of incoming data. For example, if a contract attribute has been marked as optional, the template must include appropriate guards to handle cases where the contract attribute is missing.
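One small part of static validation, checking that every simple variable reference is declared in the template model, can be sketched as follows. This is an illustration under stated assumptions rather than the validation the format prescribes: the model is represented as a hypothetical flat map of property names, only references of the form {{name}} are recognized, and the words else and default are treated as reserved.

```javascript
// Assumption: simple references look like {{name}}; block helpers such as
// {{#if ...}} or {{/if}} are not matched by this pattern.
const SIMPLE_REFERENCE = /\{\{\s*([A-Za-z_]\w*)\s*\}\}/g;
const RESERVED_WORDS = new Set(['else', 'default']);

// Returns the variable names used in the template but missing from the model.
function findUndeclaredVariables(templateText, model) {
  const names = [...templateText.matchAll(SIMPLE_REFERENCE)].map((match) => match[1]);
  return [...new Set(names)].filter(
    (name) => !RESERVED_WORDS.has(name) && !Object.prototype.hasOwnProperty.call(model, name)
  );
}

// Hypothetical flattened view of a template model with one String property.
const greetingModel = { firstName: { type: 'String', optional: false } };

console.log(findUndeclaredVariables('Hello {{firstName}}!', greetingModel)); // prints []
console.log(findUndeclaredVariables('Hello {{firstName}} {{lastName}}!', greetingModel)); // prints [ 'lastName' ]
```

A fuller validator would also need to type-check expressions and confirm that optional properties are guarded, which this sketch does not attempt.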
Most agreement templates require conditional sections, i.e. sections that should only be included based on the values of variables, for example, including a clause specific to an extended warranty.
Conceptually, these are of the form:
{{#if data.variable=="value"}}
Insert this content
{{else}}
Otherwise include this content.
{{/if}}
For more complex multi-valued inclusion of content a switch statement is useful:
{{#switch data.color}}
{{case Color.RED}}
Insert this content for red.
{{/case}}
{{case Color.BLUE}}
Include this content for blue.
{{/case}}
{{default}}
Otherwise include this content.
{{/default}}
{{/switch}}
In advanced scenarios it is not practical to package all content within a single template, because content needs to be managed, shared and reused across many templates. In those scenarios the template author should dynamically resolve and include those sub-templates, often based upon variable values that are in scope.
The example below dynamically includes agreement templates from a logical template store called hr_clauses where the jurisdiction_name property of the template is equal to the jurisdiction variable value, and the status property equals the string "active".
The query string is a SQL-like dialect to select templates based on their properties, comparing their properties with hardcoded values, or with variable values in context.
{{#insert hr_clauses where jurisdiction_name=data.jurisdiction and status="active"}}
Note: see Handlebars "Partials"
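The selection semantics of such a query can be sketched with an in-memory store. Everything here is hypothetical: the store contents, the selectTemplates helper, and the representation of the where clause as an already-parsed map of property names to expected values. Parsing the SQL-like query string itself is not shown.

```javascript
// Hypothetical template store: each entry carries the properties queried on.
const hrClauses = [
  { name: 'vacation-uk', jurisdiction_name: 'UK', status: 'active' },
  { name: 'vacation-uk-2019', jurisdiction_name: 'UK', status: 'retired' },
  { name: 'vacation-fr', jurisdiction_name: 'FR', status: 'active' },
];

// Keep the templates whose properties equal every expected value (logical AND).
function selectTemplates(store, criteria) {
  return store.filter((template) =>
    Object.entries(criteria).every(([property, expected]) => template[property] === expected)
  );
}

// Variable values in scope when the template is evaluated.
const insertScope = { data: { jurisdiction: 'UK' } };

const selectedClauses = selectTemplates(hrClauses, {
  jurisdiction_name: insertScope.data.jurisdiction, // compared with a variable value
  status: 'active', // compared with a hardcoded value
});

console.log(selectedClauses.map((template) => template.name)); // prints [ 'vacation-uk' ]
```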
Formulae allow the template author to include a dynamically calculated value, calculated from agreement data values.
In the example below an inline JS expression is being called. The JS expression is evaluated, and the expression return value is inlined into the AgreementMark document.
Hello {{data.firstName}}{{#if data.lastName && data.lastName !== 'Selman'}} {{data.lastName}}{{/if}}!
Thank you for visiting us {{%const difference = now.getTime() - data.lastVisit.getTime();return Math.ceil(difference / (1000 * 3600 * 24));%}} days ago.
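How such a formula might be evaluated can be sketched by compiling the formula body into a function that receives now and data as parameters. This is an assumption about one possible evaluation strategy, not a description of an existing engine; the evaluateFormula helper is hypothetical, and a real implementation would need to sandbox the code, because the Function constructor executes arbitrary JavaScript.

```javascript
// Hypothetical evaluator: the formula body sees `now` and `data` as parameters.
// WARNING: new Function runs arbitrary code; this sketch does no sandboxing.
function evaluateFormula(body, now, data) {
  const compiled = new Function('now', 'data', body);
  return compiled(now, data);
}

// The formula body from the example above.
const daysSinceVisit =
  'const difference = now.getTime() - data.lastVisit.getTime();' +
  'return Math.ceil(difference / (1000 * 3600 * 24));';

// Fixed dates so the result is reproducible: exactly ten days apart.
const visitData = { lastVisit: new Date('2024-01-01T00:00:00Z') };
const fixedNow = new Date('2024-01-11T00:00:00Z');

console.log(evaluateFormula(daysSinceVisit, fixedNow, visitData)); // prints 10
```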
TBD. Can external libraries or functions be called?
Simple unary variables are included in templates using a navigation syntax, allowing the template author to navigate through complex types to primitive properties.
The seller {{data.seller.name}} hereby agrees to sell {{data.goods}} to {{data.buyer.name}}
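Resolving a navigation expression against agreement data can be sketched as a walk over dot-separated segments. The resolvePath helper and the sample data are hypothetical, and only plain dotted paths are handled.

```javascript
// Hypothetical resolver for dotted paths such as "data.seller.name".
// Throws when navigation passes through a missing intermediate value.
function resolvePath(root, path) {
  return path.split('.').reduce((current, segment) => {
    if (current === undefined || current === null) {
      throw new Error(`Cannot resolve "${segment}" in path "${path}"`);
    }
    return current[segment];
  }, root);
}

const saleScope = {
  data: {
    seller: { name: 'Acme Ltd' },
    buyer: { name: 'Jane Doe' },
    goods: 'ten widgets',
  },
};

const sentence =
  `The seller ${resolvePath(saleScope, 'data.seller.name')} hereby agrees to sell ` +
  `${resolvePath(saleScope, 'data.goods')} to ${resolvePath(saleScope, 'data.buyer.name')}`;

console.log(sentence);
// prints "The seller Acme Ltd hereby agrees to sell ten widgets to Jane Doe"
```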
Text:
Hello {{firstName}}!
Model:
namespace test
@template
concept TemplateModel {
  o String firstName
}
JSON AST:
{
  "$class": "org.accordproject.commonmark.Document",
  "xmlns": "http://commonmark.org/xml/1.0",
  "nodes": [
    {
      "$class": "org.accordproject.templatemark.ClauseDefinition",
      "name": "top",
      "elementType": "test.TemplateModel",
      "nodes": [
        {
          "$class": "org.accordproject.commonmark.Paragraph",
          "nodes": [
            {
              "$class": "org.accordproject.commonmark.Text",
              "text": "Hello "
            },
            {
              "$class": "org.accordproject.templatemark.VariableDefinition",
              "name": "firstName",
              "elementType": "String"
            },
            {
              "$class": "org.accordproject.commonmark.Text",
              "text": "!"
            }
          ]
        }
      ]
    }
  ]
}
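To show how a template AST relates to agreement text, the sketch below substitutes variable values into the tree above and returns the resulting string. It is an illustration under assumptions: the draftText helper is hypothetical, only Text and VariableDefinition nodes are treated specially, and producing a full AgreementMark AST (rather than a string) is not shown.

```javascript
// Hypothetical drafting helper: replace VariableDefinition nodes with values
// from `data`, keep Text nodes as-is, and recurse into any other node.
function draftText(node, data) {
  if (node.$class === 'org.accordproject.templatemark.VariableDefinition') {
    if (!Object.prototype.hasOwnProperty.call(data, node.name)) {
      throw new Error(`Missing value for variable "${node.name}"`);
    }
    return String(data[node.name]);
  }
  if (typeof node.text === 'string') {
    return node.text;
  }
  return (node.nodes || []).map((child) => draftText(child, data)).join('');
}

// The template AST from above, abbreviated to the fields the sketch needs.
const helloTemplate = {
  $class: 'org.accordproject.commonmark.Document',
  nodes: [{
    $class: 'org.accordproject.templatemark.ClauseDefinition',
    name: 'top',
    elementType: 'test.TemplateModel',
    nodes: [{
      $class: 'org.accordproject.commonmark.Paragraph',
      nodes: [
        { $class: 'org.accordproject.commonmark.Text', text: 'Hello ' },
        { $class: 'org.accordproject.templatemark.VariableDefinition', name: 'firstName', elementType: 'String' },
        { $class: 'org.accordproject.commonmark.Text', text: '!' },
      ],
    }],
  }],
};

console.log(draftText(helloTemplate, { firstName: 'Alice' })); // prints "Hello Alice!"
```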
A variable may be associated with a recipient, indicating that the variable value must be supplied by a specific role (user) of the template. For example, a recipient role might be "buyer" and another role might be "seller", with variables for buyer address and seller address with the respective recipient association.
Properties that are not strings may optionally be formatted. For example, monetary amounts may include a currency code or currency symbol, floating-point numbers may include a precision, and dates/times are converted to human-readable, localized display strings.
For example, the syntax below formats a monetary amount using a currency symbol and two decimal digits of numeric precision.
The agreed purchase price of the goods is {{data.purchasePrice as "K00.00"}}
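The effect of such formatting can be sketched with the standard Intl.NumberFormat API. The formatMoney helper is hypothetical, and the sketch does not interpret the format string used in the template syntax; it only shows a currency symbol combined with two decimal digits.

```javascript
// Hypothetical helper: format an amount with a currency symbol and exactly
// two fraction digits, using the built-in Intl.NumberFormat.
function formatMoney(amount, currencyCode, locale = 'en-US') {
  return new Intl.NumberFormat(locale, {
    style: 'currency',
    currency: currencyCode,
    minimumFractionDigits: 2,
    maximumFractionDigits: 2,
  }).format(amount);
}

const purchasePrice = formatMoney(230000, 'USD');
console.log(`The agreed purchase price of the goods is ${purchasePrice}`);
// prints "The agreed purchase price of the goods is $230,000.00"
```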
N-ary variables (lists) must be expanded/iterated to be inserted into a template. A list variable can be expanded either as a list or as a table.
Patient {{data.patient.name}} declares the following allergies:
{{#each data.patient.allergies}}
{{this.name}}
{{/each}}
Patient {{data.patient.name}} declares the following allergies:
| Allergy Name | Severity |
|--------------|----------|
{{#each data.patient.allergies}}
| {{this.name}}|{{this.severity}}|
{{/each}}
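The table expansion above can be sketched as a function that emits one row per list element. The renderAllergyTable helper and the sample patient data are hypothetical and only illustrate the intended output of the iteration.

```javascript
// Hypothetical expansion of the table template: one markdown row per allergy.
function renderAllergyTable(patient) {
  const lines = [
    `Patient ${patient.name} declares the following allergies:`,
    '| Allergy Name | Severity |',
    '|--------------|----------|',
  ];
  for (const allergy of patient.allergies) {
    lines.push(`| ${allergy.name} | ${allergy.severity} |`);
  }
  return lines.join('\n');
}

const samplePatient = {
  name: 'Sam Smith',
  allergies: [
    { name: 'Penicillin', severity: 'High' },
    { name: 'Pollen', severity: 'Low' },
  ],
};

const allergyTable = renderAllergyTable(samplePatient);
console.log(allergyTable);
```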
TBD: Include a sample that shows conditional inclusion of rows?
The first seller {{data.seller[0].name}} hereby agrees to sell {{data.goods}} to the first buyer {{data.buyers[0].name}}
The with-helper allows you to change the evaluation context of a part of the template. In the example below the references to data.person.firstname and data.person.lastname are simplified by first binding data.person to the current context.
{{#with data.person}}
{{firstname}} {{lastname}}
{{/with}}
Dynamic content may be included in a template by using a formula that returns formatted text, or even a template.
Details TBD, but it is likely that if the formula function returns a JSON Object (AST) rather than a primitive, we can assume it is a template or rich-text content.
- CommonMark
- Accord Project CiceroMark
- Accord Project TemplateMark
- Confluence Wiki markup
- Pandoc
- Open Office XML
- Open Document Format 1.2
- Open Document OpenFormula 1.2
- Formula.js
Pull requests with additional vendors welcome!