Merge pull request #70 from robertoaloi/erlang-error-index
EEP 74: Erlang Error Index
kikofernandez authored Dec 16, 2024
2 parents 5f3f970 + c4e0325 commit e8dd334
Showing 2 changed files with 246 additions and 0 deletions.
Binary file added eeps/eep-0074-1.png
246 changes: 246 additions & 0 deletions eeps/eep-0074.md
@@ -0,0 +1,246 @@
Author: Roberto Aloi <prof3ta(at)gmail(dot)com>
Status: Draft
Type: Standards Track
Created: 11-Nov-2024
Erlang-Version: OTP-28
Post-History:
****
EEP 74: Erlang Error Index
----

Abstract
========

The **Erlang Error Index** is a _catalogue_ of errors emitted by
various tools within the Erlang ecosystem, including - but not limited
to - the `erlc` Erlang compiler and the `dialyzer` type checker.

The catalogue is not limited to tools shipped with Erlang/OTP, but it
can include third-party applications such as the [EqWAlizer][]
type-checker or the [Elvis][] code style reviewer.

Each error in the catalogue is identified by a **unique code**
and is accompanied by a description, examples and possible courses
of action. Error codes are _namespaced_ based on the tool that
generates them. Unique codes can be associated with a human-readable
**alias**.

Unique error codes can be leveraged by IDEs and language servers to
provide better contextual information about errors and make errors
easier to search and reference. A standardized error index creates a
common space for the Community to provide extra examples and
documentation, creating the perfect companion for the Erlang User
Manual and standard documentation.

Rationale
=========

The concept of an "Error Index" for a programming language is not a
novel idea. Error catalogues already exist, for example, in the
[Rust][] and [Haskell][] Communities.

Producing meaningful error messages can sometimes be challenging for
developer tools such as compilers and type checkers due to various
constraints, including limited context and character count.

By associating a **unique code** with each _diagnostic_ (warning or
error) we relieve tools from having to condense a lot of textual
information into a generic, sometimes cryptic, single
sentence. Furthermore, as the specific wording of errors and warnings
is improved over time, error codes remain constant, providing a
search-engine friendly way to index and reference diagnostics.

A good example of this is the _expression updates a literal_ warning,
introduced in OTP 27. Given the following code:

    -define(DEFAULT, #{timeout => 5000}).

    updated(Value) ->
      ?DEFAULT#{timeout => Value}.

The compiler emits the following warning:

    test.erl:8:11: Warning: expression updates a literal
    % 8| ?DEFAULT#{timeout => 1000}.
    % | ^

The meaning of the warning may not be obvious to everyone. Most
importantly, the compiler provides no information on why the warning
is raised and what a user could do about it. The user then has to
resort to a search engine, a forum or equivalent to proceed.

Conversely, we can associate a unique identifier with the diagnostic
(say, `ERL-1234`):

    test.erl:8:11: Warning: expression updates a literal (ERL-1234)
    % 8| ?DEFAULT#{timeout => 1000}.
    % | ^

The code makes it possible to link the diagnostic to an external
resource (e.g. a wiki page) containing all the additional information
about the error that would not be practical to present directly to the
user. Here is an example of what the entry could look like for the
above code:

![Erlang Error Index Sample Entry][]

Unique error codes are also easier to search for in forums and chats,
where the exact error message could vary but the error code would
stay the same.

Finally, IDEs can match on error codes (e.g. via language servers) to
provide contextual help. Both the [Erlang LS][] and the [ELP][]
language servers already use "unofficial" error codes.
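
As a rough illustration of how a tool might match on such codes, the sketch below pulls a trailing code out of a textual diagnostic. The module name `diag_code`, the `(ABC-1234)` suffix convention and the three-letter/four-digit shape are assumptions based on the examples in this EEP, not an existing API.

```erlang
-module(diag_code).
-export([extract/1]).

%% Extract a trailing error code such as "ERL-1234" from a diagnostic
%% message. The "(ABC-1234)" suffix is an assumed convention, modelled
%% on the example warning in this document.
-spec extract(binary()) -> {ok, binary()} | nomatch.
extract(Message) ->
    %% Three uppercase letters, a dash, four digits, wrapped in
    %% parentheses at the end of the message.
    Pattern = <<"\\(([A-Z]{3}-[0-9]{4})\\)\\s*$">>,
    case re:run(Message, Pattern, [{capture, all_but_first, binary}]) of
        {match, [Code]} -> {ok, Code};
        nomatch -> nomatch
    end.
```

A language server could use the returned code as a lookup key for contextual help, independently of how the message itself is worded.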

Emitting Diagnostics
--------------------

To make life easier for language servers and IDEs, tools should emit
diagnostics (errors and warnings) in a standardized format. In the
case of the compiler, this could be done by specifying an extra
option (e.g. `--error-format json`).

A possible JSON format, heavily inspired by the [LSP protocol][], is:

```json
{
  "uri": "file:///git/erlang/project/app/src/file.erl",
  "range": {
    "start": {
      "line": 5,
      "character": 23
    },
    "end": {
      "line": 5,
      "character": 32
    }
  },
  "severity": "warning",
  "code": "DIA-1234",
  "doc_uri": "https://errors.erlang.org/DIA/DIA-1234",
  "source": "dialyzer",
  "message": "This is a descriptive error message from Dialyzer"
}
```

Where:

* **uri**: The URI of the file the diagnostic refers to, expressed
  using the [RFC 3986][] format
* **range**: The zero-based range the message applies to. The range
  should be as narrow as possible. For example, if warning the user
  that a record is unused, the range of the diagnostic should only
  cover the name of the record and not the entire definition. This
  minimizes the distraction for the user when, for example, rendered
  as a squiggly line, while conveying the same information.
* **severity**: The diagnostic's severity. Allowed values are `error`,
  `warning`, `information`, `hint`.
* **code**: A unique error code identifying the diagnostic
* **doc_uri**: A URI pointing to more information about the diagnostic
* **source**: A human-readable string describing the source of the
  diagnostic
* **message**: A short, textual description of the error. The message
  should be general enough to make sense in isolation.
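
For illustration only, a tool written in Erlang could build such a diagnostic as a map and serialise it with the `json` module available since OTP 27. The module name `diag_json` and its functions are hypothetical; the field names and values mirror the example above.

```erlang
-module(diag_json).
-export([diagnostic/0, encode/1]).

%% Build the example diagnostic as an Erlang map. Note that 'end' must
%% be quoted because it is a reserved word in Erlang.
diagnostic() ->
    #{uri => <<"file:///git/erlang/project/app/src/file.erl">>,
      range => #{start => #{line => 5, character => 23},
                 'end' => #{line => 5, character => 32}},
      severity => <<"warning">>,
      code => <<"DIA-1234">>,
      doc_uri => <<"https://errors.erlang.org/DIA/DIA-1234">>,
      source => <<"dialyzer">>,
      message => <<"This is a descriptive error message from Dialyzer">>}.

%% Serialise a diagnostic to a JSON binary. json:encode/1 returns
%% iodata, so it is flattened here for convenience.
encode(Diagnostic) ->
    iolist_to_binary(json:encode(Diagnostic)).
```

Calling `diag_json:encode(diag_json:diagnostic())` yields a single JSON object whose keys are the field names listed above; the order of keys in the output is not significant.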

Error Code Format
-----------------

An error code should be composed of two parts: an alphanumeric
_namespace_ (three letters) and a numeric identifier (four digits),
separated by a dash (`-`).

A potential set of namespaces could look like the following:

| Namespace | Description |
|-----------|-----------------------------------------------------------------|
| ERL | The Erlang compiler and related tools (linter, parser, scanner) |
| DIA | The Dialyzer type-checker |
| ELV | The Elvis code-style reviewer |
| ELP | The Erlang Language Platform |
| ... | ... |

A set of potential error codes could look like:

    ERL-0123
    DIA-0009
    ELV-0015
    ELP-0001

The exact number of characters/digits for each namespace and code is
open for discussion, as is the question of whether components such as
the parser, the scanner or the `erl_lint` Erlang linter should have
their own namespace.
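
Assuming the three-letter, four-digit shape described above is adopted, validating a code and deriving its documentation URI could look like the following sketch. The module name `error_code` is hypothetical, the restriction to uppercase ASCII letters is an assumption, and the URI layout is inferred from the `doc_uri` example earlier in this document rather than specified anywhere.

```erlang
-module(error_code).
-export([parse/1, doc_uri/1]).

%% Parse a code such as <<"ERL-0123">> into {Namespace, Number}.
%% Assumes exactly three uppercase letters, a dash and four digits.
-spec parse(binary()) ->
          {ok, {binary(), non_neg_integer()}} | {error, invalid_code}.
parse(<<Ns:3/binary, $-, Digits:4/binary>>) ->
    case is_upper(Ns) andalso is_digits(Digits) of
        true -> {ok, {Ns, binary_to_integer(Digits)}};
        false -> {error, invalid_code}
    end;
parse(_) ->
    {error, invalid_code}.

%% Build a documentation URI of the assumed form
%% https://errors.erlang.org/<Namespace>/<Code>.
-spec doc_uri(binary()) -> {ok, binary()} | {error, invalid_code}.
doc_uri(Code) ->
    case parse(Code) of
        {ok, {Ns, _Number}} ->
            {ok, <<"https://errors.erlang.org/", Ns/binary, "/", Code/binary>>};
        {error, _} = Error ->
            Error
    end.

is_upper(Bin) ->
    lists:all(fun(C) -> C >= $A andalso C =< $Z end, binary_to_list(Bin)).

is_digits(Bin) ->
    lists:all(fun(C) -> C >= $0 andalso C =< $9 end, binary_to_list(Bin)).
```

If the namespace or identifier lengths change as a result of the discussion, only the binary pattern in `parse/1` would need to be adjusted.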

Responsibilities
----------------

The Erlang/OTP team would be ultimately responsible for maintaining a
list of _official_ namespaces. Each tool maintainer would then be
responsible for allocating specific codes to specific diagnostics.

Processes
---------

The error index can be implemented as a set of Markdown pages. The
approval process for a namespace (or an error code) will follow a
regular flow using a Pull Request, reviewed and approved by the
Erlang/OTP team and, potentially, other interested industry members.

Error codes cannot be reused. If a tool stops emitting an error code,
the _deprecated_ code remains documented in the index, together with a
deprecation notice. This avoids reusing a single code for multiple
purposes.
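
The no-reuse rule can be expressed as a simple invariant: once a code has been allocated it stays in the index forever, and only its status changes. The sketch below models this with a plain map. The module `code_registry` and its functions are purely illustrative and do not describe how the index would actually be stored.

```erlang
-module(code_registry).
-export([new/0, allocate/3, deprecate/2, status/2]).

%% An empty registry, mapping Code => {Status, Description}.
new() ->
    #{}.

%% Allocate a new code. Fails if the code was ever allocated, even if
%% it has since been deprecated.
allocate(Code, Description, Registry) ->
    case maps:is_key(Code, Registry) of
        true -> {error, already_allocated};
        false -> {ok, Registry#{Code => {active, Description}}}
    end.

%% Mark a code as deprecated while keeping its entry in the registry.
deprecate(Code, Registry) ->
    case Registry of
        #{Code := {_Status, Description}} ->
            {ok, Registry#{Code := {deprecated, Description}}};
        #{} ->
            {error, unknown_code}
    end.

%% Return active, deprecated or unallocated for a given code.
status(Code, Registry) ->
    case Registry of
        #{Code := {Status, _Description}} -> Status;
        #{} -> unallocated
    end.
```

Because `deprecate/2` never removes an entry, a later `allocate/3` call for the same code is rejected, which is the property the process above relies on.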

To limit the administrative burden, the index will contain only
error codes for the tools shipped with Erlang/OTP and the namespaces
for external tools. Individual error codes for each external namespace
would be managed by the respective owners.

Reference Implementation
------------------------

The [ELP website][] contains a proof of concept of what an Erlang
Error Index could look like. Ideally, such a website would live under
the `erlang.org` domain, e.g. using the `https://errors.erlang.org/` URL.

The website should use _Markdown_ as the primary mechanism to write
content and it should be easily extensible by the Community.

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.

[EqWAlizer]: https://github.com/whatsapp/eqwalizer
"The EqWAlizer Type Checker"

[Elvis]: https://github.com/inaka/elvis
"The Elvis Style Reviewer"

[Rust]: https://doc.rust-lang.org/error_codes/error-index.html
"The Rust Error Index"

[Haskell]: https://errors.haskell.org
"The Haskell Error Index"

[Erlang Error Index Sample Entry]: eep-0074-1.png
"Erlang Error Index Sample Entry"

[Erlang LS]: https://github.com/erlang-ls/erlang_ls/blob/a4a12001e36b26343d1e9d57a0de0526d90480f2/apps/els_lsp/src/els_compiler_diagnostics.erl#L237
"Erlang LS using error codes"

[ELP]: https://github.com/WhatsApp/erlang-language-platform/blob/99a426772be274f3739116736bb22d4c98c123c4/erlang_service/src/erlang_service.erl#L608
"ELP using error codes"

[ELP Website]: https://whatsapp.github.io/erlang-language-platform/docs/erlang-error-index/
"ELP website"

[LSP Protocol]: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#diagnostic

[RFC 3986]: https://datatracker.ietf.org/doc/html/rfc3986

[EmacsVar]: <> "Local Variables:"
[EmacsVar]: <> "mode: indented-text"
[EmacsVar]: <> "indent-tabs-mode: nil"
[EmacsVar]: <> "sentence-end-double-space: t"
[EmacsVar]: <> "fill-column: 70"
[EmacsVar]: <> "coding: utf-8"
[EmacsVar]: <> "End:"
[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "
