Update record names as reserved words/variables and new record definition syntax (#64)

* Record names as reserved_words/Variables and new record definition syntax

Co-authored-by: José Valim <[email protected]>
Co-authored-by: Raimo Niskanen <[email protected]>
erlang · Oct 22, 2024 · 97f8853 · 97f8853
1 parent 9a929b7
commit 97f8853
Showing 1 changed file with 251 additions and 0 deletions.
diff --git a/eeps/eep-0072.md b/eeps/eep-0072.md
@@ -0,0 +1,251 @@
+    Author: Jesse Gumm <ja(at)gumm(dot)io>
+    Status: Draft
+    Type: Standards Track
+    Created: 08-Oct-2024
+    Post-History:
+****
+EEP 72: Reserved words and Variables as record names, and enhancement to definition syntax
+----
+
+Abstract
+========
+
+This EEP loosens some of the restrictions around record names to make it no
+longer necessary to quote them when they are named with reserved words (`#if`
+vs `#'if'`) or words with capitalized first characters (terms that currently
+would be treated as variables, for example `#Hello` vs `#'Hello'`).
+
+This EEP also proposes to add a new record-like syntax to the record
+definitions (also adopting the above syntactical changes), so that the
+following record definitions would be valid and identical:
+
+```erlang
+-record('div', {a :: integer(), b :: integer()}).
+-record #div{a :: integer(), b :: integer()}.
+```
+
+The latter is the proposed new syntax.  The following would also be valid and
+identical since parentheses are optional in attributes, and since atoms may be
+quoted even when not mandatory:
+
+```erlang
+-record 'div', {a :: integer(), b :: integer()}.
+-record #'div'{a :: integer(), b :: integer()}.
+-record(#'div'{a :: integer(), b :: integer()}).
+-record(#div{a :: integer(), b :: integer()}).
+```
+
+Usage Syntax Motivation
+=======================
+
+Record names are atoms. As such, the current Erlang syntax requires the record
+names to be consistent with the rest of the language's use of atoms.
+
+All atoms in Erlang can be denoted with single quotes. For example:
+
+```erlang
+'foo'.
+'FOO'.
+'foo-bar'.
+```
+
+But, conveniently, simple atoms can be written without the wrapping quotes in
+all contexts. A simple atom consists only of alphanumeric characters,
+underscores (`_`), and at symbols (`@`), starts with a lowercase letter, and
+is not one of the 20+ reserved words. Some examples:
+
+```erlang
+foo.
+foo_Bar.
+'foo-bar'. % still quoted, since `-` is not allowed in an unquoted atom.
+```
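+
+The quoting rules can be checked directly in current Erlang. The following is
+a minimal sketch (the module name `atom_quoting_demo` is only illustrative)
+showing that quotes change how an atom is written, not which atom it is:
+
+```erlang
+-module(atom_quoting_demo).
+-export([check/0]).
+
+%% Quoting never changes which atom is denoted; it only changes how the
+%% atom may be written in source code.
+check() ->
+    true = ('foo' =:= foo),       % quotes are optional for simple atoms
+    true = is_atom('FOO'),        % unquoted, FOO would be a variable
+    true = is_atom('div'),        % unquoted, div is a reserved word
+    "div" = atom_to_list('div'),  % the quotes are not part of the name
+    ok.
+```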
+
+Conveniently, this also means that records named with simple atoms can be
+invoked and used without having to quote the atoms. For example:
+
+```erlang
+-record(foo, {a, b}).
+-record(bar, {c}).
+
+go() ->
+    X = #foo{a = 1, b = 2},
+    Y = X#foo{a = something_else},
+    Z = #bar{c = Y#foo.a},
+    ...
+```
+
+Unfortunately, that also means that records named with anything that doesn't
+fit the "simple atom" pattern must be wrapped in quotes in definition and
+usage. For example:
+
+```erlang
+-record('div', {a, b}).
+-record('SET', {c}).
+
+go() ->
+    X = #'div'{a = 1, b = 2},
+    Y = X#'div'{a = something_else},
+    Z = #'SET'{c = Y#'div'.a},
+    ...
+```
+
+While this approach is consistent with atom usage in the language, it makes
+the record syntax *feel* inconsistent when a record needs to be named with a
+reserved word or a term with a capital first letter. In practice, it almost
+guarantees that a user won't use a record named 'if', 'receive', 'fun', etc.,
+even though there may very well be a valid use case for such a name. The most
+common use case that comes to mind is from the Nitrogen Web Framework. Since
+HTML has a `div` tag, Nitrogen (which represents HTML tags using Erlang
+records) should naturally have a `#div` record. However, because 'div' is a
+reserved word (the integer division operator), the record `#panel` is used
+instead to save the programmer from having to write `#'div'`, which feels
+unnatural and awkward.
+
+Further, applications such as ASN.1 and CORBA both have naming conventions that
+rely heavily on uppercase record names, and as such, those names currently must
+be quoted as well. You can see this in Erlang's
+[`asn1`](https://github.com/erlang/otp/blob/OTP-27.1.1/lib/asn1/src/asn1_records.hrl#L35-L39)
+application. (The link above points to some record definitions in `asn1`,
+but the usage is scattered across a number of modules in the `asn1`
+application.)
+
+Usage Syntax Specification
+==========================
+
+This EEP simplifies the above example by
+
+1. Allowing reserved words and variables to be used without quotes for record
+   names, and
+2. Simplifying the definition such that the syntax between record definition
+   and record usage becomes more consistent.
+
+With the changes from this EEP, the above code becomes:
+
+```erlang
+-record('div', {a, b}).
+-record('SET', {c}).
+
+go() ->
+    X = #div{a = 1, b = 2},
+    Y = X#div{a = something_else},
+    Z = #SET{c = Y#div.a},
+    ...
+```
+
+Definition Syntax Motivation
+============================
+
+While the updated example in the usage syntax specification makes *using*
+records cleaner, there remains one more inconsistency that can also be solved
+relatively easily: the record definition still needs to quote the record name,
+as the example above demonstrates (repeated here for convenience):
+
+```erlang
+-record('div', {a, b}).
+
+go() ->
+    X = #div{a = 1, b = 2},
+    Y = X#div{a = something_else},
+    Z = Y#div.a,
+    ...
+```
+
+So while the record definition must be written with the quoted `'div'`, record
+usage no longer requires the quotes, which could certainly lead an Erlang
+beginner to wonder why 'div' needs to be quoted in the definition while other
+atom-looking terms don't.
+
+Definition Syntax Specification
+===============================
+
+Conveniently, there is a rather easy solution: allow the record usage syntax
+to also be used for the record definition.
+
+This EEP therefore also adds a new record definition syntax, improving the
+symmetry between general record usage and record definition.
+
+The above example can then be written in full as follows:
+
+```erlang
+-record #div{a, b}.
+
+go() ->
+    X = #div{a = 1, b = 2},
+    Y = X#div{a = something_else},
+    Z = Y#div.a,
+    ...
+```
+
+Implementation
+==============
+
+To update the syntax for using records, we can safely augment the parser to
+change its already existing record handling of `'#' atom '{' ... '}'` and
+`'#' atom '.' atom` into `'#' record_name '{' ... '}'` and
+`'#' record_name '.' atom`, and define `record_name` to be `atom`, `var`, or
+`reserved_word`.
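+
+As a rough illustration of the `record_name` rule, the sketch below uses
+`erl_scan` to tokenize record expressions and maps the token after `#` to an
+atom. The module and the helper `record_name/1` are hypothetical; the EEP
+describes a change to the parser grammar itself, not a token post-processing
+step like this:
+
+```erlang
+-module(record_name_sketch).
+-export([record_name/1, demo/0]).
+
+%% Hypothetical helper: map the token that follows '#' to the record's
+%% atom name, accepting atoms, variables and reserved words.
+record_name({atom, _Anno, Name}) -> {ok, Name};
+record_name({var, _Anno, Name}) -> {ok, Name};
+record_name({Word, _Anno}) when is_atom(Word) ->
+    %% Reserved words are scanned as {Word, Anno}, as is punctuation,
+    %% so ask the scanner which of the two this is.
+    case erl_scan:reserved_word(Word) of
+        true -> {ok, Word};
+        false -> error
+    end;
+record_name(_Other) -> error.
+
+demo() ->
+    {ok, [{'#', _}, T1 | _], _} = erl_scan:string("#div{a = 1}"),
+    {ok, [{'#', _}, T2 | _], _} = erl_scan:string("#SET{c = 2}"),
+    {ok, [{'#', _}, T3 | _], _} = erl_scan:string("#foo{}"),
+    {ok, 'div'} = record_name(T1),  % reserved word
+    {ok, 'SET'} = record_name(T2),  % scanned as a variable
+    {ok, foo} = record_name(T3),    % ordinary atom
+    ok.
+```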
+
+To update the record definition syntax, we can simply add a few
+modifications to the `attribute` nonterminal to allow `'#' record_name` as the
+name for the `record` attribute, instead of `atom` as for generic attributes.
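+
+For reference, this is the abstract form that today's definition syntax
+parses to. Since the EEP describes the old and new definitions as identical,
+the new syntax would presumably yield the same form, though that is an
+assumption here and not checked against the reference implementation. The
+module name is illustrative:
+
+```erlang
+-module(record_form_demo).
+-export([check/0]).
+
+%% Parse an existing-style record definition and inspect its abstract form.
+check() ->
+    {ok, Tokens, _End} = erl_scan:string("-record('div', {a, b})."),
+    {ok, Form} = erl_parse:parse_form(Tokens),
+    %% The record name is stored as a plain atom, with no trace of quoting.
+    {attribute, _, record, {'div', Fields}} = Form,
+    [{record_field, _, {atom, _, a}},
+     {record_field, _, {atom, _, b}}] = Fields,
+    ok.
+```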
+
+Backwards Compatibility
+=======================
+
+As this EEP only adds new syntax, the vast majority of existing codebases will
+still work, with the possible exception of AST/code analysis tools that are
+analyzing code using the new syntax.
+
+Syntax highlighting and code completion tools may need to be updated to support
+the new syntax if your code uses the new syntax rules.
+
+Broader Concerns and Points of Discussion
+=========================================
+
+While the new definition syntax creates some degree of symmetry around record
+usage, perfect symmetry is impossible to achieve, since a record can always
+be handled as the atom tagged tuple it actually is. The question is where
+to draw the line where the record's true nature shows, and how hard we
+should try to hide it. These are remaining concerns and inconsistencies:
+
+Auxiliary Record Functions
+--------------------------
+
+Other functions that work with records, like `is_record/2` or `record_info/2`,
+are not currently covered by any of the syntactical changes in this EEP, and as
+such, it remains necessary to quote record names if they are not simple atoms.
+For example: `is_record(X, div)` would still be a syntax error. So there is
+still not 100% symmetry.  Note that instead of using the
+`is_record(X, 'div')` guard, matching on `#div{}` is probably more frequently
+used, since it is terser and mostly regarded as more readable.
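+
+A minimal sketch of what this looks like in current Erlang, where the quoted
+name remains necessary in the auxiliary functions (module name illustrative).
+It also shows the tagged tuple underneath the record:
+
+```erlang
+-module(aux_record_demo).
+-export([check/0]).
+
+-record('div', {a, b}).
+
+check() ->
+    X = #'div'{a = 1, b = 2},
+    true = is_record(X, 'div'),           % the name is an ordinary atom argument
+    [a, b] = record_info(fields, 'div'),  % likewise for record_info/2
+    3 = record_info(size, 'div'),         % tag plus two fields
+    {'div', 1, 2} = X,                    % the record is an atom-tagged tuple
+    #'div'{a = 1} = X,                    % matching on the record syntax
+    ok.
+```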
+
+Two Definition Syntaxes?
+------------------------
+
+Introducing a new syntax for record definition could potentially lead some to
+wonder why the language has two rather different syntaxes for defining
+records. Since the syntax for getting, setting, matching, etc.
+(e.g. `#rec{a=x,y=b}`) is used far more commonly than the definition syntax,
+it only feels natural that the definition syntax would mirror usage.
+
+For more symmetry, the syntax in Erlang's type system to denote record types
+also matches the newly proposed definition syntax.
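+
+For comparison, a record type in current Erlang is already written in the
+usage-like form, although reserved words must still be quoted today. A small
+sketch with illustrative module and type names:
+
+```erlang
+-module(record_type_demo).
+-export([new/2]).
+-export_type([div_rec/0]).
+
+-record('div', {a :: integer(), b :: integer()}).
+
+%% The type expression uses the same #name{...} shape as record usage.
+-type div_rec() :: #'div'{a :: integer(), b :: integer()}.
+
+-spec new(integer(), integer()) -> div_rec().
+new(A, B) ->
+    #'div'{a = A, b = B}.
+```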
+
+Thus, I feel that sharing the existing usage and type syntax with record
+definitions would likely become the default/preferred way, and that the
+original syntax would remain for backwards compatibility.
+
+Reference Implementation
+========================
+
+The reference implementation is provided in the form of a pull request on GitHub:
+
+    https://github.com/erlang/otp/pull/7873
+
+Copyright
+=========
+
+This document has been placed in the public domain.