Clean up the use of double angular brackets

erlang · Nov 17, 2023 · 43f96cb · 43f96cb
1 parent 025f666
Showing 1 changed file with 63 additions and 65 deletions.
diff --git a/eeps/eep-0066.md b/eeps/eep-0066.md
@@ -45,8 +45,8 @@ Design Decisions
 ----------------
 
 In the following text double angle quotation marks are used to
-mark source code characters in a paragraph.  For example:
-«`.`» means the dot character (full stop).
+mark source code characters to improve clarity.
+For example: the dot character (full stop): «`.`».
 
 ### Erlang Language Structure (Tokenizer and Parser)
 
@@ -76,9 +76,9 @@ much state and looks just a few fixed number of characters ahead
 in the input.
 
 For example; from the start state, if the tokenizer sees
-a «`'`» character, it switches state to scanning a quoted atom.
-While doing so it translates escape sequences such as «`\n`»
-(into ASCII 10) and when it sees a «`'`» character it produces
+a `'` character, it switches state to scanning a quoted atom.
+While doing so it translates escape sequences such as `\n`
+(into ASCII 10) and when it sees a `'` character it produces
 an atom token and goes back to the start state.
 
 ### Problems with simple prefixes
@@ -92,7 +92,7 @@ The tokenizer would have to know of all combinations of prefix characters
 and emit distinct tokens for every combination.
 
 Today, the character sequence «`b`», «`f`», «`"`» is scanned as a token
-for the atom «`bf`» followed by the string start token «`"`».
+for the atom `bf` followed by the string start token `"`.
 That combination fails in the parser so it is syntactically invalid today,
 which is what makes simple prefixes a possible language extension.
 
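As a rough illustration of the scanning described above, the following sketch feeds `bf"abc"` to the standard `erl_scan` module. The module and function names (`scan_prefix_demo`, `tokens/1`, `demo/0`) are hypothetical and exist only for this example. Note that `erl_scan` reports a complete string token rather than a separate string start token, so this only approximates the description in the text.

```erlang
-module(scan_prefix_demo).
-export([tokens/1, demo/0]).

%% Scan a source string and return the tokens without their
%% location annotations, to make them easy to compare.
tokens(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    [strip(T) || T <- Tokens].

%% Tokens are either {Category, Anno, Value} or {Category, Anno}.
strip({Category, _Anno, Value}) -> {Category, Value};
strip({Category, _Anno}) -> Category.

demo() ->
    %% The characters b, f, " are scanned as the atom bf
    %% followed by a string, not as one prefixed string.
    [{atom, bf}, {string, "abc"}] = tokens("bf\"abc\""),
    ok.
```

An atom directly followed by a string is not a valid expression, which is why the parser rejects this token sequence today.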
@@ -107,30 +107,30 @@ Furthermore, it is likely that we want the feature of choosing
 
     re(^"+.*/.*$)
 
-Among the desired delimiters are «`/`» and «`<`»+«`>`».  The currently
-valid code «`b<X`» meaning atom «`b`» less than «`X`», would instead
-have to be interpreted as prefixed string start «`b<`» with «`X`»
+Among the desired delimiters are `/` and `< >`.  The currently
+valid code «`b<X`» meaning atom `b` less than `X`, would instead
+have to be interpreted as prefixed string start `b<` with `X`
 being the first string content character.
 
-For the «`/`» character we run into similar problems with for example
+For the `/` character we run into similar problems with for example
 «`b/X`», which would be a run-time error today, but if we also would
 want capital letter prefixes, then «`B/X`» is perfectly valid today
 but would become a string start.
 
 There are more likely problems with simple string prefixes:
-«`#bf{`» is today the start of a record named «`bf`», and is
-scanned as punctuation character «`#`», atom «`bf`» and separator «`{`»,
-which the parser sorts out to be a record start.
+«`#bf{`» is today the start of a record named `bf`, and is
+scanned as punctuation character `#`, atom `bf` and separator `{`,
+which the parser figures out to be a record start.
 
 With simple prefix characters the tokenizer would have to be rewritten
 to recognize «`#bf`» as a new record token, a rewrite that might cause
 unexpected changes in record handling.  For example, today, «`# bf {`»
-is also a valid record start, so to be completely compatible the tokenizer
+is also a valid record start, so to be compatible the tokenizer
 would have to allow white-space or even newlines within the new record
-token, between «`#`» and the atom characters, which would be really ugly...
+token, between `#` and the atom characters, which would be really ugly...
 
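The record and delimiter cases above can be checked with a small sketch, again using the standard `erl_scan` module. The names `scan_record_demo`, `categories/1` and `demo/0` are hypothetical. The sketch only shows how the current tokenizer treats these inputs; it does not model any proposed change.

```erlang
-module(scan_record_demo).
-export([categories/1, demo/0]).

%% Scan Source and reduce each token to its category
%% (atom, var, '#', '{', '<', ...).
categories(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    [erl_scan:category(T) || T <- Tokens].

demo() ->
    %% A record start is three tokens, with or without white space
    %% or newlines between them.
    ['#', atom, '{'] = categories("#bf{"),
    ['#', atom, '{'] = categories("# bf {"),
    ['#', atom, '{'] = categories("#\nbf\n{"),
    %% b<X is an atom, the less-than operator and a variable.
    [atom, '<', var] = categories("b<X"),
    ok.
```

Because white space is dropped between tokens, all three record spellings produce the same token sequence, which is the compatibility constraint the text describes.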
 For other reasons, namely that function call parentheses are optional,
-Elixir has chosen to use the «`~`» character as the start of
+Elixir has chosen to use the `~` character as the start of
 a string prefix which they call a "[Sigil][1]".
 
 Having a distinct start character for this feature simplifies
@@ -139,39 +139,39 @@ tokenizing and parsing.
 ### Sigil
 
 In a general sense, a [Sigil][3], is a prefix to a variable
-that indicates its *type*, such as «`$I`» in Basic or Perl,
-where «`$`» is the sigil and «`I`» is the variable.
+that indicates its *type*, such as `$I` in Basic or Perl,
+where `$` is the sigil and `I` is the variable.
 
 Here we define a Sigil as a prefix (and a suffix) to a string literal
 that indicates how it should be *interpreted*.  The Sigil is
 a *syntactic sugar* that creates some Erlang term.
 
 A Sigil string literal consists of:
 
-1. The [Sigil Prefix][], «`~`» followed by a name that may be empty.
+1. The [Sigil Prefix][], `~` followed by a name that may be empty.
 2. The [String Content][] within [String Delimiters][].
 3. The [Sigil Suffix][], a name character sequence that may be empty.
 
 A Sigil looks like a string with a prefix (and maybe a suffix),
 but expands to some term (or expression), so it cannot be subject
 to the string concatenation the parser does.
 
-Therefore `"abc" "def"` is `"abcdef"` but `~s"abc" "def"`
+Therefore «`"abc" "def"`» is `"abcdef"` but «`~s"abc" "def"`»
 should be illegal, and also all other sequences consisting
 of a Sigil of any type, and any other term, in any order.
 
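The string concatenation mentioned above is done by the parser, which the following sketch demonstrates for plain strings with the standard `erl_scan` and `erl_parse` modules. The names `string_concat_demo`, `parse_term/1` and `demo/0` are hypothetical. Sigils are deliberately left out, since their behaviour depends on the implementation of this proposal.

```erlang
-module(string_concat_demo).
-export([parse_term/1, demo/0]).

%% Scan and parse a single expression terminated by a full stop,
%% then convert the abstract form of the literal to a term.
parse_term(Source) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    erl_parse:normalise(Expr).

demo() ->
    %% Two adjacent string literals are merged into one by the parser.
    "abcdef" = parse_term("\"abc\" \"def\"."),
    ok.
```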
 ### Sigil Prefix
 
-The Sigil Prefix starts whith the Tilde character «`~`», followed
+The Sigil Prefix starts with the Tilde character `~`, followed
 by the Sigil Type which is a name composed of a sequence of characters
 that are allowed as the second or later characters in a variable or an atom.
-In short ISO [Latin-1][] letters, digits, «`_`» and «`@`».
+In short ISO [Latin-1][] letters, digits, `_` and `@`.
 The Sigil Type may be empty.
 
 The Sigil Type defines how the [Sigil][] syntactic sugar
 shall be interpreted.  The suggested Sigil Types are:
 
-* «»: the vanilla (default) [Sigil][].
+* «»: the vanilla (default (empty name)) [Sigil][].
 
   Creates an Erlang `unicode:unicode_binary()`.
   It is a string represented as a UTF-8 encoded binary,
@@ -191,44 +191,44 @@ shall be interpreted.  The suggested Sigil Types are:
   the first and most desired missing string feature in Erlang.
   This sigil does just that.
 
-* «`b`»: `unicode:unicode_binary()`
+* `b`: `unicode:unicode_binary()`
 
   Creates a UTF-8 encoded binary, handling escape characters
   in the string content.  Other features such as string interpolation
   will require another Sigil Type or using the [Sigil Suffix][].
 
-  In Elixir this corresponds to the «`~s`» sigil, a [string][4].
+  In Elixir this corresponds to the `~s` sigil, a [string][4].
 
-* «`B`»: `unicode:unicode_binary()`, verbatim.
+* `B`: `unicode:unicode_binary()`, verbatim.
 
   Creates a UTF-8 encoded binary, with verbatim string content.
   The content ends when the end delimiter is found.
   There is no way to escape the end delimiter.
 
-  In Elixir this corresponds to the «`~S`» sigil, a [string][4].
+  In Elixir this corresponds to the `~S` sigil, a [string][4].
 
-* «`s`»: `string()`.
+* `s`: `string()`.
 
   Creates a Unicode codepoint list, handling escape characters
   in the string content.  Other features such as string interpolation
   will require another Sigil Type or using the [Sigil Suffix][].
 
-  In Elixir this corresponds to the «`~c`» sigil, a [charlist][5].
+  In Elixir this corresponds to the `~c` sigil, a [charlist][5].
 
-* «`S`»: `string()`, verbatim.
+* `S`: `string()`, verbatim.
 
   Creates a Unicode codepoint list, with verbatim string content.
   The content ends when the end delimiter is found.
   There is no way to escape the end delimiter.
 
-  In Elixir this corresponds to the «`~C`» sigil, a [charlist][5].
+  In Elixir this corresponds to the `~C` sigil, a [charlist][5].
 
-* «`R`»: regular expression.
+* `R`: regular expression.
 
   This EEP proposes to not implement regular expressions yet.
   It is still unclear how integration with the `re` module
   should be done, and if it is worth the effort compared
-  to just using the «`S`» or the «`B`» Sigil Type.
+  to just using the `S` or the `B` Sigil Type.
 
   The best idea so far was that this sigil creates a term
   `{re,RE::unicode:charlist(),Flags::[unicode:latin1_char()]}`
@@ -245,7 +245,7 @@ shall be interpreted.  The suggested Sigil Types are:
   the regular expression rules.
 
   The main advantage of a regular expression [Sigil][] is to avoid
-  the additional escaping of «`\`» that regular erlang strings require.
+  the additional escaping of `\` that regular erlang strings require.
 
   Today: `re:run(Subject, "^\\s*\"[a-z]+\\\\\\d+\"", [caseless,unicode])`
 
@@ -264,9 +264,9 @@ since they are often a source for hard to find problems.
 
 These proposed Sigil Types are named according to the corresponding
 Erlang types.  The Sigil Types in [Elixir][1] are named according to
-Elixir types.  So, for example, a «`~s`» Sigil Prefix in Erlang
+Elixir types.  So, for example, a `~s` Sigil Prefix in Erlang
 creates an Erlang `string()`, which is a list of Unicode codepoints,
-but in Elixir the «`~s`» Sigil Prefix creates an Elixir [String][4]
+but in Elixir the `~s` Sigil Prefix creates an Elixir [String][4]
 which is a UTF-8 encoded binary.
 
 Consistency within the language is supposedly more important
@@ -280,13 +280,13 @@ A specific start delimiter character has a corresponding
 end delimiter character.
 
 The allowed start-end delimiter character pairs are:
-«`()`», «`[]`», «`{}`», «`<>`» and «`«»`».
+`( ) [ ] { } < > « »`.
 
 The following characters are start delimiters that have themselves
-as end delimiters: «`/`», «`|`», «`'`», «`"`» and «`#`».
+as end delimiters: `/ | ' " #`.
 
 Triple-quote delimiters are also allowed, that is; a sequence of
-3 or more double quote «`"`» characters as described in [EEP 64][].
+3 or more double quote `"` characters as described in [EEP 64][].
 
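One way to picture the delimiter rules above is as a lookup from start delimiter to end delimiter. This is a minimal sketch under that assumption, not the reference implementation; `sigil_delimiters_demo` and `end_delimiter/1` are hypothetical names, and triple-quote delimiters are not handled.

```erlang
-module(sigil_delimiters_demo).
-export([end_delimiter/1, demo/0]).

%% Bracketing pairs: the end delimiter differs from the start delimiter.
end_delimiter($() -> {ok, $)};
end_delimiter($[) -> {ok, $]};
end_delimiter(${) -> {ok, $}};
end_delimiter($<) -> {ok, $>};
end_delimiter(16#AB) -> {ok, 16#BB};   % Latin-1 « and »
%% Characters that are their own end delimiter: / | ' " #
end_delimiter(C) when C =:= $/; C =:= $|; C =:= $\';
                      C =:= $\"; C =:= $# ->
    {ok, C};
%% Anything else is not a string delimiter in this sketch.
end_delimiter(_) -> error.

demo() ->
    {ok, $)} = end_delimiter($(),
    {ok, $/} = end_delimiter($/),
    error = end_delimiter($\\),
    ok.
```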
 For a given [Sigil Type][] except the [Vanilla Sigil][],
 which String Delimiters are used does not affect how
@@ -297,20 +297,19 @@ doesn't occur in the string's content, so interpreting the string content
 does not interfere with finding the end delimiter.
 
 The proposed set of delimiters is the same as in [Elixir][1],
-plus «`«»`» and «`#`».  They are the characters in [Latin-1][]
+plus `« »` and `#`.  They are the characters in [Latin-1][]
 that are normally used for bracketing or text quoting,
-and those that feel like full height vertikal lines.
-Except: «`\`» is too often used for character escaping,
-«`` `» and «`´`» look too much like «`'`»,
-«`¦`» looks too much like «`|`», and «`#`» is too useful
-to *not* include since it in many contexts (shell scripts,
-Perl regular expressions) it is a comment character than
-is easy to avoid in the [String Content][].
-
-It may not be obvious how to type the «`«`» and «`»`» characters
+and those that feel like full height vertical lines,
+except: `\` is too often used for character escaping,
+`` ` `` and `´` look too much like `'`, `¦` looks too much like `|`,
+and `#` is too useful to *not* include since in many contexts
+(shell scripts, Perl regular expressions) it is a comment character
+that is easy to avoid in the [String Content][].
+
+It may not be obvious how to type the `«` and `»` characters
 on some keyboards (US), but there *are* ways that should not
-hinder a determined programmer.  When using X Compose sequences
-it is simply [`Compose`] [`<`] [`<`] and [`Compose`] [`>`] [`>`].
+discourage a determined programmer.  When using X Compose sequences
+it is simply [Compose] [<] [<] and [Compose] [>] [>].
 
 ### String Content
 
@@ -322,7 +321,7 @@ of indentation and leading and trailing newline is done as usual
 as described in [EEP 64][].
 
 In a string with single character [String Delimiters][],
-normal Erlang escape sequences prefixed with «`\`» are honoured,
+normal Erlang escape sequences prefixed with `\` are honoured,
 as usual for regular Erlang strings and quoted atoms.
 
 A specific [Sigil Type][] can have its own character escaping rules,
@@ -338,10 +337,10 @@ of name characters.
 
 The Sigil Suffix may indicate how to interpret the String Content,
 for a specific [Sigil Type][].
-For example; for the «`~R`» [Sigil Prefix][] (regular expression),
+For example; for the `~R` [Sigil Prefix][] (regular expression),
 the Sigil Suffix is interpreted as short form compile options
 such as «`i`» that makes the regular expression character
-case insensitive.  For example `~R/^from: /i`.
+case insensitive.  For example «`~R/^from: /i`».
 
 Things that may have to be performed by the tokenizer, such as
 how to handle escape character rules, should not be affected
@@ -419,38 +418,37 @@ should represent an *uncompiled* regular expression with compile flags.
 
 ### Comparison with Elixir
 
-The [Vanilla Sigil][] (empty [Sigil Type][]) is not allowed in Elixir.
+There is no [Vanilla Sigil][] (empty [Sigil Type][]) in Elixir.
 
 This EEP proposes to add the following [String Delimiters][]
-to the set that Elixir has: «`«»`» and «`#`».
+to the set that Elixir has: `« » #`.
 
 The string and binary [Sigil Type][]s are named differently
 between the languages, to keep the names consistent within
-the language (Erlang): «`~s`» in Elixir is «`~b`» in Erlang,
-and «`~c`» in Elixir is «`~s`» in Erlang, so «`~s`» means
+the language (Erlang): `~s` in Elixir is `~b` in Erlang,
+and `~c` in Elixir is `~s` in Erlang, so `~s` means
 different things, because strings are different things.
 
 When Elixir allows escape sequences in the [String Content][]
 it also allows string interpolation.  This EEP proposes to *not*
 implement string interpolation in the suggested [Sigil Type][]s.
 
-
 When Elixir doesn't allow escape sequences in the [String Content][],
 it still allows escaping the end delimiter.  This EEP proposes
 that such strings should be truly verbatim with no possibility
 to escape the end delimiter.
 
 There are small differences in which escape sequences are implemented
 in the languages; Elixir allows escaping of newlines, and has
-an escape sequence «`\a`», that Erlang does not have.
+an escape sequence `\a`, that Erlang does not have.
 
 There are also small differences in how newlines are handled
-between «`~S`» heredocs in Elixir and triple-quoted strings in Erlang.
+between `~S` heredocs in Elixir and triple-quoted strings in Erlang.
 See [EEP 64][].
 
-Details about regular expression sigils, «`~R`», in particular
+Details about regular expression sigils, `~R`, in particular
 their [Sigil Suffix][]es remain to be decided in Erlang.
-Also, there is a question about escaping the end delimiter or not.
+Also, there still is a question about escaping the end delimiter or not.
 
 It has not been decided how or even *if* string interpolation
 will be implemented in Erlang, but a [Sigil Suffix][] or
@@ -459,8 +457,8 @@ new [Sigil Type][]s would most probably be used.
 Reference Implementation
 ------------------------
 
-[PR-7684][] Implements the «`s`», «`S`», «`b`», «`B`»
-and the «``» (vanilla) Sigil, according to this EEP.
+[PR-7684][] Implements the `~s`, `~S`, `~b`, `~B`
+and the `~` (vanilla) Sigil, according to this EEP.
 
 The tokenizer produces a `sigil_prefix` token before the string literal,
 and a `sigil_suffix` token after.  The parser merges and transforms them