-
Notifications
You must be signed in to change notification settings - Fork 67
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #62 from dszoboszlay/non-skipping-generators
Add proposal for non-skipping generators
- Loading branch information
Showing
1 changed file
with
151 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,151 @@ | ||
Author: Dániel Szoboszlay <[email protected]> | ||
Status: Draft | ||
Type: Standards Track | ||
Created: 01-Jul-2024 | ||
Erlang-Version: 28 | ||
Post-History: | ||
**** | ||
EEP 70: Strict and relaxed generators | ||
---- | ||
|
||
Abstract | ||
======== | ||
|
||
This EEP proposes the addition of a new, *strict* variant of all | ||
existing generators (list, bit string and map). Currently existing | ||
generators are *relaxed*: they ignore terms in the right-hand side | ||
expression that do not match the left-hand side pattern. Strict | ||
generators on the other hand shall fail with exception `badmatch`. | ||
|
||
Rationale | ||
========= | ||
|
||
The motivation for strict generators is that relaxed generators | ||
can hide the presence of unexpected elements in the input data of a | ||
comprehension. For example consider the below snippet: | ||
|
||
[{User, Email} || #{user := User, email := Email} <- all_users()] | ||
|
||
This list comprehension would filter out users that don't have an email | ||
address. This may be an issue if we suspect potentially incorrect input | ||
data, like in case `all_users/0` would read the users from a JSON file. | ||
Therefore cautious code that would prefer crashing instead of silently | ||
filtering out incorrect input would have to use a more verbose map | ||
function: | ||
|
||
lists:map(fun(#{user := User, email := Email}) -> {User, Email} end, | ||
all_users()) | ||
|
||
Unlike the generator, the anonymous function would crash on a user | ||
without an email address. Strict generators would allow similar | ||
semantics in comprehensions too: | ||
|
||
[{User, Email} || #{user := User, email := Email} <:- all_users()] | ||
|
||
This generator would crash (with a `badmatch` error) if the pattern | ||
wouldn't match an element of the list. | ||
|
||
The proposed operators for strict generators are `<:-` (for lists | ||
and maps) and `<:=` (for bit strings) instead of `<-` and `<=`. This | ||
syntax was chosen because `<:-` and `<:=` somewhat resemble the `=:=` | ||
operator that tests whether two terms match, and at the same time keep | ||
the operators short and easy to type. Having the two types of operators | ||
differ by a single character, `:`, also makes the operators easy to | ||
remember as "`:` means strict." | ||
|
||
Alternate Designs | ||
================= | ||
|
||
There are numerous other ways of achieving the goal of crashing upon | ||
encountering non-matching input: | ||
|
||
1. Converting the comprehension to a `map` call, just as in the example | ||
of the previous section. | ||
|
||
2. Moving the pattern matching that may fail to the result expression of | ||
the comprehension: | ||
|
||
[begin | ||
#{user:= User, email := Email} = Usr, | ||
{User, Email} | ||
end | ||
|| Usr <- all_users()] | ||
|
||
3. Moving the pattern matching that may fail to a filter in the | ||
comprehension (which would still have to evaluate to a boolean, | ||
making this solution quite awkward looking): | ||
|
||
[{User, Email} | ||
|| Usr <- all_users(), | ||
(#{user := User, email := Email} = Usr) > 0] | ||
|
||
4. The same as before, but using binders proposed in [EEP 12][]: | ||
|
||
[{User, Email} | ||
|| Usr <- all_users(), | ||
#{user := User, email := Email} = Usr] | ||
|
||
Most of these solutions are much more verbose than the strict | ||
generator syntax, undermining the compactness of comprehensions. But | ||
there are more serious issues too: | ||
|
||
* 1 would not work for bit string comprehensions, because there isn't | ||
a `map` function available for bit strings. | ||
|
||
* 2 would make it impossible to use sub terms of the pattern (`Usr` in | ||
this example) in subsequent generators or filters. | ||
|
||
* The intent of the code isn't clear in most of these solutions | ||
(definitely not in 3, and arguably in 4 and 2). | ||
|
||
* None of these techniques can solve the problem of final non-matching | ||
bits of a bit string. | ||
|
||
The issue with bit string generators is that unlike lists and maps, bit | ||
strings don't have natural "elements" the generator could iterate over. | ||
Instead the pattern in the generator dictates where to split the bit | ||
string into parts. This could inevitably lead to situations where the | ||
bit string ends with some bits that don't match the pattern, like in | ||
this example: | ||
|
||
[X || <<X:16>> <- <<1,2,3>>] | ||
|
||
The existing generator would skip these final non-matching bits, and | ||
this behaviour cannot be changed due to backward compatibility reasons. | ||
It is only possible to guarantee that a bit string generator would fully | ||
consume its input by introducing a new type of bit string generator or | ||
some other new syntax that would alter the behaviour of the existing | ||
generator. | ||
|
||
Reference Implementation | ||
======================== | ||
|
||
Current implementation: [PR #8625][]. | ||
|
||
Backward Compatibility | ||
====================== | ||
|
||
There are no backward compatibility issues with the proposed new syntax, | ||
since the `<:-` and `<:=` operators used to be invalid syntax in | ||
previous versions of Erlang. | ||
|
||
[PR #8625]: https://github.com/erlang/otp/pull/8625 | ||
"Reference implementation PR" | ||
|
||
[EEP 12]: eep-0012.md | ||
"Extensions to comprehensions, O'Keefe" | ||
|
||
Copyright | ||
========= | ||
|
||
This document is placed in the public domain or under the CC0-1.0-Universal | ||
license, whichever is more permissive. | ||
|
||
[EmacsVar]: <> "Local Variables:" | ||
[EmacsVar]: <> "mode: indented-text" | ||
[EmacsVar]: <> "indent-tabs-mode: nil" | ||
[EmacsVar]: <> "sentence-end-double-space: t" | ||
[EmacsVar]: <> "fill-column: 70" | ||
[EmacsVar]: <> "coding: utf-8" | ||
[EmacsVar]: <> "End:" | ||
[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: " |