New frontend AST--motivation and high-level design principles #8233

kazcw · 2023-11-06T15:21:51Z

kazcw
Nov 6, 2023
Collaborator

Source code edits are not currently easy in GUI2, and the implementation is currently broken. We could fix the current design, but not without increasing the complexity of the implementation. An alternative design could provide a better interface for application logic to build on and isolate the necessary complexity to a few independent mechanisms, improving development speed and application reliability.

The IDE must support graph-driven edits and text edits (CodeMirror, external file changes); this implies multiple representations of the edited program. To achieve consistent behavior, we choose a representation to use as the single source of truth. Currently, in GUI2 (as in GUI1) we use text as this canonical representation. At one time this design was well-suited to the architecture of GUI1. However, as the IDE has become more advanced, our paradigm has shifted. Edits have gone from being essentially textual to being tree-driven. Translating tree edits to text edits is complex: maintaining the ID map correctly is difficult; text-pasting tree construction is messy; Y.Text conflict resolution is wrong for our purposes. To better support the modern widget-tree model of node edits, we should change to a tree-structured canonical representation of programs.

The parser-produced Trees are not suitable for this working representation. The parser output describes the source code syntactically. It does so efficiently; it also provides a lot of redundant information for convenience (e.g. for each node it provides a span, which is equivalent to the recursive sum of the node's tokens). These goals of efficiency and convenient analysis are at odds with editability, which is a core requirement for the GUI. We are missing a step to reconcile them: We should translate the parser's Tree to a GUI-specific representation. Currently, we extend Tree by wrapping it. This bridges the gap partway, but cannot provide editability--for that, we must perform all our analysis up front and throw away the parsed Tree objects.
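To illustrate the redundancy mentioned above, here is a minimal sketch in which a node's span is derived from its tokens instead of being stored. The `Token` and `TreeNode` shapes and the `spanLength` helper are hypothetical simplifications, not the actual parser types.

```typescript
// Hypothetical, simplified shapes; the real parser output is richer.
interface Token { kind: 'token'; text: string }
interface TreeNode { kind: 'node'; children: (Token | TreeNode)[] }

// A node's span length is the recursive sum of its tokens' lengths,
// so a span stored on every node is effectively a cache of this value.
function spanLength(item: Token | TreeNode): number {
  if (item.kind === 'token') return item.text.length;
  return item.children.reduce((sum, child) => sum + spanLength(child), 0);
}

// `foo + bar`, with whitespace folded into the tokens for brevity.
const sumExpr: TreeNode = {
  kind: 'node',
  children: [
    { kind: 'token', text: 'foo ' },
    { kind: 'token', text: '+ ' },
    { kind: 'node', children: [{ kind: 'token', text: 'bar' }] },
  ],
};

const before = spanLength(sumExpr); // 4 + 2 + 3 = 9

// Editing a single token changes the derived span of every ancestor.
const first = sumExpr.children[0];
if (first.kind === 'token') first.text = 'foobar ';

const after = spanLength(sumExpr); // 7 + 2 + 3 = 12
```

If spans were stored on the nodes, the edit above would have to update each ancestor as well; deriving them on demand avoids that bookkeeping.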

When we have an IDE-optimized tree structure, we can use it for GUI edits, and to synchronize the node graph. Tree edits can be kept simple by using an abstract and referentially-transparent representation: An abstract tree has fewer invariants to maintain, and allows the GUI logic to focus on the intent of a change; a referentially-transparent tree can be modified without any bookkeeping in associated data structures. Correct CRDT synchronization should be achieved by thoughtful design of the concrete representation of the frontend AST graph. Text edits will take on some of the complexity that we are eliminating from node edits, as ID maps now need to be maintained through text edits instead of for node edits. However, the diff algorithm that will support this is fairly straightforward, and it's self-contained. As a side benefit, it will enable us to repair the metadata map after raw text edits.
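As a rough illustration of maintaining IDs through a text edit, the sketch below carries IDs from old lines to new lines by exact text match. It is not the diff algorithm referred to above: it assumes one node per line, and `reassignIds` and `freshId` are hypothetical names. A real implementation would presumably compare tree structure, not just line text.

```typescript
type NodeId = string;
interface Line { id: NodeId; code: string }

// Hypothetical helper: carry IDs across a raw text edit by exact text match.
function reassignIds(oldLines: Line[], newCode: string[], freshId: () => NodeId): Line[] {
  // Queue of not-yet-reused old IDs for each distinct line of code.
  const available = new Map<string, NodeId[]>();
  for (const { id, code } of oldLines) {
    const queue = available.get(code) ?? [];
    queue.push(id);
    available.set(code, queue);
  }
  // Each new line reuses a matching old ID if one remains, else gets a fresh one.
  return newCode.map((code) => {
    const reused = available.get(code)?.shift();
    return { id: reused ?? freshId(), code };
  });
}

let counter = 0;
const freshId = (): NodeId => `new-${++counter}`;

const oldLines: Line[] = [
  { id: 'a', code: 'x = 1' },
  { id: 'b', code: 'y = x + 1' },
];

// An external editor inserted a line between the two existing ones.
const updated = reassignIds(oldLines, ['x = 1', 'z = 2', 'y = x + 1'], freshId);
const updatedIds = updated.map((line) => line.id); // ['a', 'new-1', 'b']
```

Because the unchanged lines keep their IDs, any metadata keyed by ID stays attached to them after the text edit.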

I will post a PR to the design repo shortly, going into more detail on a design that will meet these goals.

JaroslavTulach · 2023-11-10T04:33:27Z

JaroslavTulach
Nov 10, 2023
Collaborator

we should change to a tree-structured canonical representation of programs.

Shall we? My more than twenty years of experience developing an IDE suggest that AST-based editors come with a lot of quirks and are in general not suitable for humans. People just don't think in terms of ASTs. As such, I don't believe an AST-based representation is suitable when working with Enso's textual representation.

3 replies

kazcw Nov 10, 2023
Collaborator Author

What I am proposing is not a UI change--our interface is already an AST/text hybrid editor. We abstract higher levels of the tree the most:
At any given time we display the subtrees of a particular body-block (the rest of the graph is abstracted away entirely).
For each assignment in the block, we draw a node (the exact order of code lines is abstracted away).
Within each node, we delineate arguments (informed by type information as much as by syntax); we offer non-textual (abstracted according to type) and textual (fully concrete) means of editing each argument.

My proposed change is necessary to support this hybrid paradigm in a consistent way. It is a requirement for the multi-client synchronization we would like to offer, and it makes it much easier to avoid surprising results of text edits.

In the new GUI, we would like to offer CRDT-based synchronization. When a CRDT merges edits textually, it tends to break the syntax tree. Syntax changes introduced by merge conflict resolution can affect locations of lines in the file and order of arguments in a function-application node. Because these conflicts can occur at levels of the syntax tree that we render in a highly-abstracted way, and differ from the intent of any of the edits that cause them, they are hard for the user to understand. Only AST-based synchronization can ensure that when edits are merged, syntax is respected.
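A toy example of the kind of surprise a textual merge can produce. This is not how Y.Text or any real CRDT is implemented; `mergeTextEdits` is a hypothetical stand-in that simply keeps every concurrent insertion and deletion, expressed against positions in a shared base text.

```typescript
// Edits are expressed against character positions in the shared base text.
type TextEdit =
  | { kind: 'insert'; at: number; text: string }
  | { kind: 'delete'; at: number };

// Toy merge: keep every insertion and every deletion from all clients.
function mergeTextEdits(base: string, edits: TextEdit[]): string {
  let result = '';
  for (let i = 0; i <= base.length; i++) {
    // Emit text inserted before base position i.
    for (const edit of edits) {
      if (edit.kind === 'insert' && edit.at === i) result += edit.text;
    }
    // Emit the base character unless some client deleted it.
    const deleted = edits.some((edit) => edit.kind === 'delete' && edit.at === i);
    if (i < base.length && !deleted) result += base.charAt(i);
  }
  return result;
}

const base = 'f (g x)';
// Client A removes the parentheses, intending `f g x`.
const clientA: TextEdit[] = [{ kind: 'delete', at: 2 }, { kind: 'delete', at: 6 }];
// Client B adds an argument to the inner call, intending `f (g x y)`.
const clientB: TextEdit[] = [{ kind: 'insert', at: 6, text: ' y' }];

const merged = mergeTextEdits(base, [...clientA, ...clientB]); // 'f g x y'
```

The merged text is valid syntax, but `y` has become an argument of `f` instead of `g`, which neither client intended.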

The GUI also offers essentially text-based edits, of expressions below a certain level in the syntax tree. The representation I'm proposing is designed with this in mind. After any AST edit, a print-parse process is used to normalize the modified subtree (this ensures that our AST after an edit is the AST that would result from parsing the corresponding text--not doing this would definitely introduce a risk of quirks). This edit process allows textual edits--but with the ability to contain the edit to a subexpression. This will be especially important once we have syntactic constructs that can span sibling lines (such as multiline if/else), which could allow a line to consume the rest of the block, which would break our core graph abstraction.
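A minimal sketch of the print-parse idea, using a made-up mini-language of identifiers, application, and parentheses. The `parse`, `print`, and `editArgument` functions are hypothetical and are not Enso's parser or the proposed implementation. The typed text is parsed as its own subexpression, and the printer adds parentheses so the edit cannot leak into the enclosing expression.

```typescript
type Expr =
  | { kind: 'ident'; name: string }
  | { kind: 'app'; func: Expr; arg: Expr };

// Parse left-associative application over identifiers and parentheses.
function parse(code: string): Expr {
  const tokens: string[] = code.match(/[()]|[^\s()]+/g) ?? [];
  let pos = 0;
  function atom(): Expr {
    const token = tokens[pos++];
    if (token === undefined || token === ')') throw new Error('expected an expression');
    if (token === '(') {
      const inner = application();
      if (tokens[pos++] !== ')') throw new Error('expected )');
      return inner;
    }
    return { kind: 'ident', name: token };
  }
  function application(): Expr {
    let expr = atom();
    while (pos < tokens.length && tokens[pos] !== ')') {
      expr = { kind: 'app', func: expr, arg: atom() };
    }
    return expr;
  }
  const expr = application();
  if (pos !== tokens.length) throw new Error('unexpected )');
  return expr;
}

// Print with parentheses only where the tree shape requires them.
function print(expr: Expr): string {
  if (expr.kind === 'ident') return expr.name;
  const arg = expr.arg.kind === 'app' ? `(${print(expr.arg)})` : print(expr.arg);
  return `${print(expr.func)} ${arg}`;
}

// Replace the argument of an application with text typed by the user.
function editArgument(expr: Expr, typed: string): Expr {
  if (expr.kind !== 'app') throw new Error('not an application');
  const edited: Expr = { kind: 'app', func: expr.func, arg: parse(typed) };
  // Normalize: keep the tree that parsing the printed text would yield.
  return parse(print(edited));
}

const original = parse('f a');
const result = editArgument(original, 'g b');
const text = print(result); // 'f (g b)'
```

Had the typed text been pasted into the line as plain text, `f g b` would parse as `(f g) b`, changing the structure outside the edited argument.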

The proposal is a hybrid model for a hybrid editor. It allows us to use a tree paradigm at the highest levels (as the GUI does) and for synchronization, while supporting the kind of text edits we need, in a tree-informed way. I think for us, it avoids a lot of quirks.

JaroslavTulach Nov 11, 2023
Collaborator

we display the subtrees of a particular body-block

Do you mean we display just a single function like Main.func1 or main?

JaroslavTulach Nov 11, 2023
Collaborator

When a CRDT merges edits textually, it tends to break the syntax tree.

and

The GUI also offers essentially text-based edits, of expressions below a certain level in the syntax tree.
The representation I'm proposing is designed with this in mind.

lead me to a question:

What do we want to do with external textual edits?

Imagine the IDE is running, but the user opens the source in an external text editor and changes it. What will happen then?

From what I hear, you only propose to support "text-based edits, of expressions below a certain level in the syntax tree." However, that's not what you get with external edits. Even the slightest external edit is a big-bang change. Yet unless Enso wants to give up on its dual-representation story, which includes text, we need a way to turn even (common) external modifications into a reasonably sized CRDT-tree change.

How does your proposal cover this scenario?

PS: You showed me your AST matching demo a year ago, so I assume we both know what needs to be done to support external edits effectively. I am just bringing them up to make sure their support is included in the plans we are making.

JaroslavTulach · 2023-11-10T04:36:10Z

JaroslavTulach
Nov 10, 2023
Collaborator

There are certainly some publications analyzing this topic, but let's start with the MPS example. There is a reason why the same company is not building its other tools on top of MPS! None of their Java, Ruby, Python, etc. support builds on the AST-based representation that MPS offers.

4 replies

JaroslavTulach Nov 11, 2023
Collaborator

The biggest problem is that AST-based editors only allow users to make

text-based edits, of expressions below a certain level in the syntax tree

However, as users don't think in terms of trees, it is hard to explain to them why they cannot edit the next/enclosing AST element when positioned at a certain level of the syntax tree. To make up an example, consider a case expression:

y = case x of
  1 -> "Good"

With the cursor located at the end of "Good", AST editors tend to prevent users from pressing enter and typing 2 -> "Bad". Or pressing enter and typing y.length.

I am not aware of any literature that offers a consistent solution to this problem without giving up on the AST being the primary representation of the source.

hubertp Nov 16, 2023
Collaborator

I think there is also some prior work like Hazel and they even briefly mention conflict resolution at the end of the talk.

Overall my concern is that we are aiming to introduce yet another significant change in order to support a feature ("multi-user editing") that was not supported in the old GUI. I understand the motivation, but the translation was supposed to be more or less 1 to 1, at least in the first version.

somebody1234 Nov 20, 2023
Collaborator

@JaroslavTulach

expressions below a certain level
i beg to differ - there is nothing stopping an AST editor from binding Enter to append a new child - especially since it knows the structure of the entire AST anyway.

(and in most editors it should ideally be simple enough to implement - either rely on native event bubbling to catch an input at a higher level, or keep a(n optionally weak) reference to the parent node and handle the keypress there)
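A rough sketch of the parent-reference variant described above, without any DOM. The `EditorNode` shape, the `onEnter` hook, and the `child` and `pressEnter` helpers are all hypothetical, and a plain reference to the parent is used here where a weak one could be substituted.

```typescript
interface EditorNode {
  label: string;
  parent?: EditorNode;
  children: EditorNode[];
  // Hypothetical hook: returns true if this node handled the key.
  onEnter?: (self: EditorNode) => boolean;
}

// Create a node and attach it as the last child of `parent`.
function child(parent: EditorNode, label: string): EditorNode {
  const node: EditorNode = { label, parent, children: [] };
  parent.children.push(node);
  return node;
}

// Walk up from the focused node until some ancestor handles Enter.
function pressEnter(focused: EditorNode): boolean {
  for (let node: EditorNode | undefined = focused; node; node = node.parent) {
    if (node.onEnter?.(node)) return true;
  }
  return false;
}

// The case expression knows how to append a new, empty branch to itself.
const caseExpr: EditorNode = {
  label: 'case x of',
  children: [],
  onEnter: (self) => {
    child(self, '<new branch>');
    return true;
  },
};
const branch = child(caseExpr, '1 -> "Good"');
const literal = child(branch, '"Good"');

// The cursor is inside the literal, which has no Enter handler of its own,
// so the key press travels up to the enclosing case expression.
const handled = pressEnter(literal);
const labels = caseExpr.children.map((node) => node.label);
```

After the call, `caseExpr` has a second child where the user could type `2 -> "Bad"`.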

somebody1234 Nov 20, 2023
Collaborator

@hubertp IIRC the primary motivations are not multi-user editing, but rather single-user editing with multiple representations:

the graph editor
the builtin code editor
the file containing the source code, that is stored on disk

iow: i don't think we can avoid this as a problem without abandoning support for text-based editing completely (?)

sirinath · 2023-11-18T16:06:23Z

sirinath
Nov 18, 2023

Can you post a link to the PR?

Is it an Enso parser written in Enso itself?

0 replies
