pp+: update UIP-0119 #67

Open
wants to merge 5 commits into
base: main
Changes from 3 commits
214 changes: 126 additions & 88 deletions UIPS/UIP-0119.md
@@ -2,8 +2,8 @@
uip: "0119"
title: Pretty Printer Improvements
description: Make the Hoon pretty printer robust and customizable
author: ~sidnym-ladrut (@sidnym-ladrut)
status: Draft
author: ~sidnym-ladrut (@sidnym-ladrut), ~fidwed-sipwyn
status: Final
type: Standards Track
category: Hoon
created: 2024-01-20
@@ -13,7 +13,7 @@ created: 2024-01-20

The Hoon pretty printer (i.e. [standard library 5c][stdlib-5c]) is broadly
responsible for transforming `$type` and `$vase` nouns into `$tank`
equivalents, generally for the purpose of rendering these nouns in a
equivalents, generally for the purpose of rendering these nouns in a
human-legible format. While its current implementation admirably performs this
task for nearly all simple and semi-complex nouns, it suffers from performance
and legibility problems for common nontrivial inputs such as meta-`$vase`
@@ -53,6 +53,24 @@ with outputs for nouns like:
summary syntax, which is difficult to parse for an app developer trying to
determine the function's signature (doccords helps with this somewhat,
but cannot easily print door arms).
4. `^-(hoon 1)`: Prints:
```hoon
-need
?(
[p=#1 q=#1]
... rest of the hoon type here ...
[%sggr p=?([p=@tas q=#1] @tas) q=#1]
[%sgwt p=@ud q=#1 r=#1 s=#1]
[%sgcb p=#1 q=#1]
[%sgbr p=#1 q=^#1.#hoon]
)
-have.@ud
```
instead of:
```hoon
-need.#hoon :: $hoon is named with $+
-have.@ud
```

## Specification

@@ -66,101 +84,121 @@ with outputs for nouns like:
>
> – Ted Blackman (`~rovnys-ricfer`), ["Developer Week: Core Dev AMA (2022)"][cdama-22]

- In a similar vein to [`+easy-print`][stdlib-ep], develop a parallel version
of the [`+us` door][stdlib-5c] that leverages the existing `[%hint [%know
mark=@tas] …]` type structure to enable `%mark`-like type identification and
corresponding custom printing functions. This new door will assume the
temporary name `+ur`.
- Introduce the concept of a pretty printer gate type `$ppin` with signature
`$-([inp=(each type vase) bas=$-((each type vase) tank)] tank)`.
- The `bas` argument is the base pretty printer gate that will be used by
`+ur`, which is passed to each `$ppin` in order to enable recursive `$tank`
building.
- The following is a rough sketch of a `$ppin` gate for the `$unit` type:
```hoon
|= [inp=(each type vase) bas=$-((each type vase) tank)]
^- tank
?- -.inp
%& :: type
?> ?=(%hint -.p.inp)
:+ %rose [" " "u(" ")"]
[(bas %& q.p.inp)]~
::
%| :: vase
?> ?=(%hold -.p.p.inp)
=+ ;;(=type (slot 2 (~(play ut p.p.p.inp) q.p.p.inp)))
:+ %rose [" " "[" "]"]
:~ [%leaf '~' ~]
(bas %| type q.p.p.inp)
==
==
```
- Make the `+ur` door sample a copy of `+us` with the additional internal
state of a verbosity setting `veb` and a pretty printer map `pin`:
- Rewrite and replace the [`+us` door][stdlib-5c] to leverage the existing
`[%hint [%know mark=@tas] …]` type structure to enable `%mark`-like type
identification and corresponding custom printing functions.
- The new `+us` door sample will consist of a maximum depth `dep`, a
verbosity setting `veb`, and a pretty printer map `pin`:
```hoon
++ ur
=> |%
+$ ppin $-([(each type vase) $-((each type vase) tank)] tank)
--
=+ :* veb=*?(%base %most %lest) :: default verbosity
pin=*(map term ppin) :: print overrides
==
=+ sur=type
++ us
=> |%
+$ tase (each type vase) :: type/vase
+$ meta :: recursion metadata
$~ [~ ~ 30 %base ~] ::
$: saw=(set tase) :: recursive types
ids=(map type @) :: types id
dep=@ud :: maximum depth
veb=?(%base %most %lest) :: default verbosity
pin=(map term ppin) :: print overrides
== ::
+$ base $-([tase meta] (unit [meta tank])) :: base printer
+$ ppin $-([tase meta base] (unit [meta tank])) :: custom printer
--
::
|%
:: … arms start here …
```
- `veb` is the verbosity setting used by the pretty printer for all type
marks not spanned by `pin`, which has three proposed flavors: (1) `%base`
(the current behavior), (2) `%most` (raw noun dumps), and `%lest`
(aggressive `%know` name substitution)
- `dep` is decremented each time a `%cell` is encountered; when it
reaches zero, printing stops with `[...]`. The maximum depth was
introduced primarily to keep the pretty printer from hanging for
minutes, or even running out of memory, when printing very large
outputs (e.g. cores as raw nouns).

- `veb` is the verbosity setting used by the pretty printer and has three
flavors:

| %base | %most | %lest |
|-------------------------|--------------------------------|-----------------|
| (map @tas @ud) | ?(%~ [n=[p=@t q=@ud] l=[...]]) | #t/#map |
| u(1) | [~ u=1] | u(1) |
| {[p=a q=1] [p=b q=2]} | [n=[p='b' q=2] l=~ r=[...]] | {[a 1] [b 2]} |

- `%base`: Search for a custom printer (`$ppin`); if none is found, fall
back to the hardcoded default printer. If there is no default printer
either, perform name substitution (similar to `%lest`).
- `%most`: Ignore hints and print a more extensive output.
- `%lest`: For types, perform aggressive `%know` name substitution for all
named types (`+$`). For vases, hide all faces. Cores are printed in a
shorter form for both `type` and `vase` inputs.

- `pin` enumerates a set of type marks and associated printing functions
that will override the default `veb` printing behavior for those types.
- Extract the heuristic type identification logic from `+dole:us` into a new
gate `+doxx:ur` with signature `$-(type type)` that annotates the input
with `%know` notes for the common types (e.g. `list`, `tree`, `unit`, etc.).
- Analogous to the above, create a parallel door to `+ut` nominally named `+uq`
that contains a modified nesting algorithm in `+dext:nest:uq` that reports
`$type` errors as diffs instead of a sequence of comparative dumps. This could
be approached a couple of different ways:
- *Text Diff*: Leverage the [text diffing
algorithm](https://github.com/urbit/urbit/blob/develop/pkg/arvo/sys/zuse.hoon#L3966)
currently in `zuse` to present a line-based diff.
- *Tree Diff*: Leverage the [noun diffing
algorithm](https://github.com/urbit/urbit/pull/6681) implemented by
`~racfer-hattes` (i.e. @ilyakooo0) in order to present a tree-based diff.
- Segregate these changes to a standalone `/lib/uip119/hoon` file, which can be
integrated into `/sys/hoon/hoon` as part of a future `hoon` Kelvin decrement.
- `+ur` and `+uq` are designed to be drop-in replacements for `+us` and `+ut`,
respectively.
- Optionally, introduce `$+` hints to common types recognized by `+doxx:ur`
in `/sys/hoon/hoon`.
- Optionally, consider making the internal `+us` configuration more
accessible (like the [`+rs` door][stdlib-rs]) at the cost of backward
compatibility (breaking change to all external `+us` door invocations).
that will be called for the types named with `+$` and for all the cases
of `$type`.
- The pretty printer gate `$ppin` has the following signature:
`$-([tase meta base] (unit [meta tank]))`.
- Where `$tase` is the input to be printed: `(each type vase)`.
- `$meta` is used to pass some metadata down into the recursion.
- The `base` argument is the base pretty printer gate that will be
used by `+us`, which is passed to each `$ppin` in order to enable
recursive `$tank` building.
- The output is wrapped in a unit because a printer may return `~` on
error, allowing the `%fork` printer to continue to the next case
of the `%fork`.
- The following is a rough sketch of a `$ppin` gate for the `$unit` type:
```hoon
|= [inp=tase sen=meta bas=base]
^- (unit [meta tank])
=+ typ=?-(-.inp %& p.inp, %| p.p.inp)
?> ?=([%fork *] typ)
=+ yed=(sort ~(tap in p.typ) aor)
?> ?=([* [[%cell * [%face *]] ~]] yed)
?- -.inp
%& :: type
?~ res=(bas [%& +<+>+>.yed] sen) ~
`[-.u.res [%rose [" " "u(" ")"] +.u.res ~]]
%| :: vase
(bas inp sen)
==
```
- We introduce `$+` hints for commonly used types, such as `$set`, `$map`, and
`$unit`, among others.

- Create a new gate `+doxx` with signature `$-(type type)` that is used
internally by the pretty printer to heuristically annotate the input with
`%know` notes for the `%list` types (e.g. `tape`, `path`, `wall`, etc.).

- A new generator `:dojo|pp-config` was introduced, allowing developers to
update `%dojo`'s pretty printer configuration.

- The `+nest` arm has been updated to avoid throwing unnecessary hints (as
shown in the `^-(hoon 1)` example above). It also now repeats the error
location if the output exceeds an arbitrary size of 100 lines (80 characters
each).

- Several default printers have been reworked, including the gate printer,
which changed its style from `<number-of-arms.hash ...subject>` to
`$-(input output)`.
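
The `dep` cutoff described above can be illustrated with a minimal sketch. This is not the UIP's implementation: it is a hypothetical standalone gate over raw nouns (the names `non` and `dep` are chosen here for illustration only), assuming that every atom is rendered as `@ud` and that each cell consumes one unit of depth.

```hoon
::  hypothetical sketch: depth-limited printing of a raw noun
::  (not the +us door; atoms are always rendered as @ud)
|=  [non=* dep=@ud]
^-  tank
::  atoms are printed directly
?@  non  [%leaf (scow %ud non)]
::  depth budget exhausted: elide the rest of the cell
?:  =(0 dep)  [%leaf "[...]"]
::  otherwise print head and tail with one less unit of depth
:+  %rose  [" " "[" "]"]
:~  $(non -.non, dep (dec dep))
    $(non +.non, dep (dec dep))
==
```

Because the recursion stops as soon as `dep` reaches zero, the work done is bounded by the depth budget rather than by the size of the noun, which is the property the maximum depth is meant to provide for very large inputs.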

## Rationale

- Using an embedded and customizable pretty printer facility in `+us` has two
major advantages:
- The existing "special case" print functions (e.g. for `$tree`s, `$face`s,
etc.) can be integrated into a unified and holistic architecture.
- Each `$type`-based print function can be fine tuned to the abstraction
layer of the developer—core developers can use verbose, near-noun
representations and app developers can use terse, near-`$+`/`%know`
presentations.
- Presenting type nesting failures as diffs reduces output noise, drastically
reducing deciphering time and improving developer experience.
- The fixed PP logic is simple and consists of recursively traversing the
type/vase, and for each case of `$type` (`%atom`, `%cell`, `%core`, etc.),
calling the associated printer functions, all of which can technically be
overridden by a `$ppin` gate. The `dep` setting makes it flexible to
constrain outputs on demand, while `veb` allows for general customization.
Together, these changes make the pretty printer robust and customizable.
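
The override idea can be sketched in a few lines. The gate below is a simplified, hypothetical illustration and not the `$ppin` interface from the specification: it assumes overrides are plain `$-(* tank)` gates stored in a map keyed by a type name `nam`, and it falls back to a trivial built-in printer when no override is registered.

```hoon
::  hypothetical sketch: look up a custom printer by name, else fall back
::  (simplified; the real $ppin gates also take $meta and a base printer)
|=  [nam=term non=* pin=(map term $-(* tank))]
^-  tank
=/  cus=(unit $-(* tank))  (~(get by pin) nam)
::  a registered override wins
?^  cus  (u.cus non)
::  fallback: atoms as @ud, cells elided
?@  non  [%leaf (scow %ud non)]
[%leaf "[...]"]
```

The design choice this illustrates is that the traversal logic stays fixed while the per-type rendering is data (a map of gates), so callers can change output without modifying the printer itself.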

## Backwards Compatibility

- Development can safely be performed using parallel arms and structures,
enabling extensive vetting/testing prior to integration.
- Using the new stack of `$type`-related structures will break backward
compatibility and require a `hoon` Kelvin decrement.
- Replacing `+us` and `+ut` with their parallel counterparts `+ur` and
`+uq` will break backward compatibility with all code dependent on pretty
printer outputs, but will retain interface continuity. Optionally, the `+us`
door can be replaced to make pretty printer overriding simpler at the cost of
an interface discontinuity.
- Replacing the old pretty printer will break backwards compatibility with any
code that directly calls the pretty printer's door. However, it will not
affect code that uses functions such as `+sell`, `+skol`, `+duck:ut`, or `+dunk:ut`. It
may also break code that relies on specific pretty printer outputs.

## Future work

- The `$tank` type could be modified or replaced to be more expressive.
- A good noun diff algorithm could be used to make `test.hoon`'s diffs smaller.

## Acknowledgements
