Architecture proposal #2

LPeter1997 · 2022-07-31T11:01:57Z

I'd like to propose an architecture for this tool to be as versatile, easily extensible and testable as possible. This would involve decoupling the input language from the class hierarchy and transformations, and decoupling the output format from those as well.

The components would be:

Core library: responsible for modeling the inheritance tree and providing model transformations
Input language library/libraries: responsible for building up the inheritance tree and transformations from some input language (YAML, JSON, ...)
Templating library: wraps the core library's tree model (somewhat like red-green trees) to provide a richer view model for a templating engine such as Scriban
Output language library/libraries: responsible for generating output for a specific language from the view model provided by the templating library
CLI: a command-line interface that drives all of this as a simple-to-invoke .NET tool

In the following sections I'd like to detail these components slightly more.

Core library

Class hierarchy

The core could provide the model for the inheritance tree. It could look something like so (just a sketch):

// Describes a compiler pass
record CompilerPass(
	string Name,
	string Documentation,
	// Transformations to apply to get the tree based on the previous pass
	IList<ITreeTransformer> Transformations,
	TreeHierarchy Tree,
	CompilerPass? PreviousPass,
	CompilerPass? NextPass
);

// Wraps up an entire hierarchy
// Not necessary, but makes the API nicer
record TreeHierarchy(
	IDictionary<string, TreeNodeClass> Nodes
);

// Describes a single class in a hierarchy
record TreeNodeClass(
	string Name,
	string Documentation,
	// Null for a root class that has no parent
	TreeNodeClass? Parent,
	IDictionary<string, TreeNodeMember> Members,
	// Language-specific things could be here
	// Sealed? Abstract? Some applied attribute for Python?
	ISet<object> Attributes
);

// Describes a single member/property in a class
record TreeNodeMember(
	string Name,
	string Documentation,
	// Dynamic languages might not have a type
	string? Type,
	// Language-specific things could be here
	// Public? Apply some attribute? Leave out from pretty-printing?
	ISet<object> Attributes
);

Something like this is not too language-specific, but also not so general that it becomes practically useless. Things like the type specification could be elaborated further if needed. Read-write properties would also be nicer for such an API; I only used records for their concise syntax.

Tree transformation

The key operation the core would provide is tree transformation. A transformation takes a tree hierarchy as input and produces a new tree hierarchy. This is how each pass would build its tree from the tree of the previous pass. Transformations could optionally be restricted to nodes matching a certain pattern. A possible API:

interface ITreeNodePattern
{
	public bool IsRecursive { get; }
	public bool Matches(TreeNodeClass c);
}

interface ITreeTransformer
{
	public ITreeNodePattern? Pattern { get; }
	public TreeHierarchy Apply(TreeHierarchy h);
}

Built-in transformations we could provide (and we could extend later):

Add a node
Remove a node
Add a member to node
Remove a member from node

Built-in patterns we could provide (and we could extend later):

Node with given name
Node with name matching a regex
Node with given member(s)
All nodes
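
To make the pattern and transformation ideas more concrete, here is a minimal sketch of one built-in pattern and one built-in transformation. The model types are trimmed-down copies of the ones sketched earlier, and `NamePattern`, `AddMember`, `TransformDemo` and `ApplyAll` are hypothetical names chosen for illustration, not part of the proposal. `IsRecursive` is left out because its exact semantics are not specified yet.

```csharp
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copies of the model types, enough for this sketch.
record TreeNodeMember(string Name, string? Type);
record TreeNodeClass(string Name, IReadOnlyDictionary<string, TreeNodeMember> Members);
record TreeHierarchy(IReadOnlyDictionary<string, TreeNodeClass> Nodes);

interface ITreeNodePattern
{
    bool Matches(TreeNodeClass c);
}

interface ITreeTransformer
{
    ITreeNodePattern? Pattern { get; }
    TreeHierarchy Apply(TreeHierarchy h);
}

// Hypothetical pattern: matches the node with exactly the given name.
sealed record NamePattern(string Name) : ITreeNodePattern
{
    public bool Matches(TreeNodeClass c) => c.Name == Name;
}

// Hypothetical transformation: adds (or overwrites) a member on every matching node.
// A null pattern is treated as "all nodes". The input hierarchy is not modified.
sealed record AddMember(ITreeNodePattern? Pattern, TreeNodeMember Member) : ITreeTransformer
{
    public TreeHierarchy Apply(TreeHierarchy h)
    {
        var nodes = h.Nodes.ToDictionary(
            kv => kv.Key,
            kv => Pattern is null || Pattern.Matches(kv.Value)
                ? kv.Value with { Members = WithMember(kv.Value.Members) }
                : kv.Value);
        return new TreeHierarchy(nodes);
    }

    // Copies the member table and adds the new member to the copy.
    private IReadOnlyDictionary<string, TreeNodeMember> WithMember(
        IReadOnlyDictionary<string, TreeNodeMember> members)
    {
        var copy = members.ToDictionary(kv => kv.Key, kv => kv.Value);
        copy[Member.Name] = Member;
        return copy;
    }
}

static class TransformDemo
{
    // A pass's tree could be derived by folding its transformations
    // over the tree of the previous pass.
    public static TreeHierarchy ApplyAll(TreeHierarchy start, IEnumerable<ITreeTransformer> steps) =>
        steps.Aggregate(start, (h, step) => step.Apply(h));
}
```

Because `Apply` returns a new hierarchy and leaves its input alone, each transformation can be tested in isolation by comparing the hierarchy before and after.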

Rationale for the scope

I believe this is a well-testable and easily extensible component. The remaining components deal with input and output, so they will likely be covered mostly by integration and end-to-end tests. This component can be unit-tested to oblivion with all the patterns and transformations.

Input language libraries

These would be the less interesting libraries: each takes an input language and translates it into the core library's representation of passes. Most likely a front-end would invoke an existing parser for a format like YAML or JSON, but the input could also be some custom notation. I wouldn't focus on developing many of these "front-ends" until the core has a stable enough API. Note that an input language doesn't have to expose 100% of the core features. It's perfectly fine to support only the necessities.
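
As a rough illustration of how thin such a front-end could be, here is a sketch that reads node descriptions from JSON using `System.Text.Json`. The JSON shape (`nodes`, `name`, `parent`, `members`) is purely an assumption made for this example, as are the `JsonFrontEnd` class and the simplified model records; the proposal does not define an input schema.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Simplified model records for this sketch (the parent is referenced by name).
record TreeNodeMember(string Name, string? Type);
record TreeNodeClass(string Name, string? ParentName, IReadOnlyDictionary<string, TreeNodeMember> Members);

static class JsonFrontEnd
{
    // Assumed input shape (hypothetical, not defined by the proposal):
    // { "nodes": [ { "name": "Expr" },
    //              { "name": "Lit", "parent": "Expr", "members": { "Value": "int" } } ] }
    public static IReadOnlyDictionary<string, TreeNodeClass> Parse(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var result = new Dictionary<string, TreeNodeClass>();
        foreach (var node in doc.RootElement.GetProperty("nodes").EnumerateArray())
        {
            var name = node.GetProperty("name").GetString()!;
            // "parent" and "members" are optional in this assumed schema.
            var parent = node.TryGetProperty("parent", out var p) ? p.GetString() : null;
            var members = new Dictionary<string, TreeNodeMember>();
            if (node.TryGetProperty("members", out var ms))
            {
                foreach (var m in ms.EnumerateObject())
                {
                    // A JSON null type maps to "no type", as for dynamic languages.
                    members[m.Name] = new TreeNodeMember(m.Name, m.Value.GetString());
                }
            }
            result[name] = new TreeNodeClass(name, parent, members);
        }
        return result;
    }
}
```

This front-end exposes only names, parents and member types, which is in line with the idea that an input language may support just the necessities.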

Templating library

The templating library would wrap the tree in a more redundant data structure that is easier for template engines to consume. For example, the node wrappers would provide navigation to both the parent and the children, or they could list all members, including inherited ones. To stay language-agnostic, these should be generic wrappers that the language-specific wrappers can reuse. For example, this library could ship a node wrapper like this:

abstract class TreeNodeClassView<TSelf>
	where TSelf : TreeNodeClassView<TSelf>
{
	private readonly TreeNodeClass underlying;

	protected virtual bool HasAttribute(object attr) => underlying.Attributes.Contains(attr);

	public TSelf Parent => /* wrap up the parent in this type */;
	public IEnumerable<TSelf> Derived => /* wrap up the derived classes in this type */;

	// ...
}
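
One possible way to fill in the "wrap up ... in this type" placeholders is an abstract factory method that each concrete view implements. The sketch below assumes exactly that: the `Wrap` hook, the `AllMembers` property, the `PlainView` class and the `ViewDemo` helper are hypothetical additions, and the model records are simplified copies of the earlier ones.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified model records for this sketch.
record TreeNodeMember(string Name, string? Type);
record TreeNodeClass(string Name, TreeNodeClass? Parent, IReadOnlyDictionary<string, TreeNodeMember> Members);
record TreeHierarchy(IReadOnlyDictionary<string, TreeNodeClass> Nodes);

abstract class TreeNodeClassView<TSelf>
    where TSelf : TreeNodeClassView<TSelf>
{
    protected TreeNodeClass Underlying { get; }
    protected TreeHierarchy Hierarchy { get; }

    protected TreeNodeClassView(TreeNodeClass underlying, TreeHierarchy hierarchy)
    {
        Underlying = underlying;
        Hierarchy = hierarchy;
    }

    // Hypothetical hook: each concrete view knows how to construct itself.
    protected abstract TSelf Wrap(TreeNodeClass node);

    public string Name => Underlying.Name;

    // Null for a root class.
    public TSelf? Parent => Underlying.Parent is null ? null : Wrap(Underlying.Parent);

    // Direct subclasses, found by scanning the hierarchy for nodes whose parent is this node.
    public IEnumerable<TSelf> Derived => Hierarchy.Nodes.Values
        .Where(n => n.Parent?.Name == Underlying.Name)
        .Select(n => Wrap(n));

    // Inherited members first, then the members declared on this class.
    public IEnumerable<TreeNodeMember> AllMembers =>
        (Parent?.AllMembers ?? Enumerable.Empty<TreeNodeMember>()).Concat(Underlying.Members.Values);
}

// Minimal concrete view, only here to exercise the generic base.
sealed class PlainView : TreeNodeClassView<PlainView>
{
    public PlainView(TreeNodeClass underlying, TreeHierarchy hierarchy) : base(underlying, hierarchy) { }
    protected override PlainView Wrap(TreeNodeClass node) => new PlainView(node, Hierarchy);
}

static class ViewDemo
{
    // Builds Expr <- BinaryExpr and returns a view over BinaryExpr.
    public static PlainView BinaryExprView()
    {
        var expr = new TreeNodeClass("Expr", null,
            new Dictionary<string, TreeNodeMember> { ["Span"] = new TreeNodeMember("Span", "TextSpan") });
        var binary = new TreeNodeClass("BinaryExpr", expr,
            new Dictionary<string, TreeNodeMember>
            {
                ["Left"] = new TreeNodeMember("Left", "Expr"),
                ["Right"] = new TreeNodeMember("Right", "Expr"),
            });
        var hierarchy = new TreeHierarchy(new Dictionary<string, TreeNodeClass>
        {
            [expr.Name] = expr,
            [binary.Name] = binary,
        });
        return new PlainView(binary, hierarchy);
    }
}
```

With this setup, `ViewDemo.BinaryExprView().AllMembers` yields the inherited `Span` member together with `Left` and `Right`, which is the kind of redundant, template-friendly data the view model is meant to expose.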

Output language libraries

The output language libraries would adapt the wrappers in the templating library to the destination language (this is why the wrappers are abstract and generic). For example, adapting it to C#:

class CSharpTreeNodeClassView : TreeNodeClassView<CSharpTreeNodeClassView>
{
	// Specialize things like attributes to be specific to C#
	public bool IsSealed => HasAttribute(CSharpAttribs.Sealed);
	public bool IsAbstract => HasAttribute(CSharpAttribs.Abstract);

	// ...
}
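
The sketch above leaves open what `CSharpAttribs.Sealed` actually is. One simple option, assumed here, is an enum whose values are stored in the untyped attribute set of the core model. The example below shows that lookup working end to end; the `CSharpHeader.Render` helper is a hand-written stand-in for what a real template would produce, and the node record is simplified.

```csharp
using System.Collections.Generic;

// Assumed definition: language-specific markers as a plain enum.
enum CSharpAttribs { Sealed, Abstract }

// Simplified node record for this sketch (the parent is referenced by name).
record TreeNodeClass(string Name, string? ParentName, ISet<object> Attributes);

static class CSharpHeader
{
    // Hypothetical helper: renders only the class declaration header.
    public static string Render(TreeNodeClass node)
    {
        var modifiers = "public";
        // Boxed enum values compare by type and value, so Contains works on ISet<object>.
        if (node.Attributes.Contains(CSharpAttribs.Abstract)) modifiers += " abstract";
        if (node.Attributes.Contains(CSharpAttribs.Sealed)) modifiers += " sealed";
        var baseClause = node.ParentName is null ? "" : $" : {node.ParentName}";
        return $"{modifiers} class {node.Name}{baseClause}";
    }
}
```

For a node named `BinaryExpr` with parent `Expr` and the `Sealed` marker, `Render` returns `public sealed class BinaryExpr : Expr`. Other languages could define their own marker types without the core library knowing about them.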

The libraries would ship the required templates:

A template for generating a class hierarchy
A template for generating a visitor base class

Optionally, the library would ship a language formatter, or have the knowledge to invoke a pre-installed language formatter.

thinker227 added this to the Language overhaul milestone Jul 31, 2022

thinker227 added the enhancement, frontend and functionality labels Jul 31, 2022

thinker227 mentioned this issue Aug 1, 2022 in "Update tree/pass model #5" (merged)

thinker227 pinned this issue Sep 22, 2022
