Skip to content

Constraints based Wollok Type System

Nahuel Palumbo edited this page Aug 24, 2021 · 27 revisions

Welcome to the Constraints-based Wollok Type System. Paper aquí

What is a Type System?

A Type System is a method for reasoning about programs, classifying expressions according to the kinds of values they compute

-- Benjamin Pierce, Types and Programming Languages, MIT Press

Main purpose

A type system has several objectives:

  • It can determine whether a program has errors or not (related to language safety)
  • It helps to find concepts and abstractions (specially for large systems)
  • They provide important information for documentation tools such as Static Diagram, Outline view, Autocompletion, etc. ...
  • ...and in Wollok language, which has no explicit type annotations, it is a key tool for type inference.

General Overview

Main process begins in ConstraintBasedTypeSystem class, in package org.uqbar.project.wollok.typesystem.constraints, and consists in several phases:

  • Initialization
  • Analysis
  • Type inference
  • Error reporting

Initialization

wolloktypesystem_initialization 1

In initialization process

  • we add all core WKOs, such as assert, void, game, error and Wollok Game keys
  • a TypeVariablesRegistry is created. This registry is a Map of URI => TypeVariable associations
  • TypeVariablesRegistry is populated via WollokCoreTypeDeclarations
  • main method in WollokCoreTypeDeclarations is declarations, which has a special DSL for defining core (lib) types:
PairType.constructor(Any, Any)
PairType >> "getKey" === #[] => Any;
Number >> "between" === #[Number, Number] => Boolean
Number % Number => Number

This DSL is implemented in superclass TypeDeclarations, with methods like operator_doubleArrow, constructor, etc. For example:

def operator_doubleArrow(List<? extends TypeAnnotation> parameterTypes, TypeAnnotation returnType) {
	new MethodTypeDeclaration(parameterTypes, returnType)
}

In a near future, we could change this behavior replacing it by annotations in lang, lib and other wlk core files. Meanwhile we should annotate all contructors and methods of all wko, classes and mixins coded in core files.

When does initialization happen? Right now it is a validator, but since Type System must infer all types around the project and not just one file, an issue was created.

Logger

Logger is configured thru log4-normal.properties files. It is important to set additivity configuration in order to avoid duplicated logs.

log4j.additivity.org.uqbar=false
log4j.additivity.org.eclipse=false
log4j.additivity.org.uqbar.project.wollok.typesystem=false

Default configuration is set to DEBUG, if you want to change severity please pay attention to this line:

log4j.logger.org.uqbar.project.wollok.typesystem=DEBUG, stdout

Analysis

ConstraintGenerator is responsible for analysis phase, its main method is generateVariables.

wolloktypesystem_analysis

When analysing a file (a WFile) there are two steps:

  1. Adding globals, like wko and classes.
  2. Whole constraint generation process
def dispatch void generateVariables(WFile it) {
	eContents.forEach[addGlobals]
	eContents.forEach[generateVariables]
}

Constraint generation process traverses AST nodes, with different consequences: for example, if you have a class, you must analyse its

  • hierarchy (parents)
  • mixins
  • members
  • constructors
def dispatch void generateVariables(WClass it) {
	// TODO Process supertype information: parent and mixins
	members.forEach[generateVariables]
	constructors.forEach[generateVariables]
}

In case of a String literal, like "wollok", you must associate it with String Wollok type.

def dispatch void generateVariables(WStringLiteral it) {
	newSealed(classType(STRING))
}

👇

We should ensure a dispatch method generateVariables() for every Wollok grammar element (WTest, WStringLiteral, WTry, etc.)

For example, if we have this file definition:

object pepita {
	var energy = 0
	
	method fly(kms) {
		energy = energy - kms
	}
}

Corresponding AST will be:

wolloktypesystem_ast

So, there must be a generateVariables dispatch method for WFile, WNamedObject, WVariableDeclaration, WMethodDeclaration, etc.

Type information model

ConstraintGenerator defines the variable generation strategy, which is delegated to TypeVariablesRegistry class. Before we go on it is important to study how type information is modeled.

wolloktypesystem_typeinfo

Every variable has its type (TypeVariable class), which in turn has

  • subtypes
  • supertypes
  • type info
  • and owner
  • errors are reported by type system

An example

Given this hierarchy

wolloksupertypesandsubtypes

Let's analyze this Java-like code

Animal a = new Dog()
  • Animal is a supertype for a,
  • while Dog is a subtype for a

In this example

Animal a = if (b) new Dog() else new Cat()
  • Animal is a supertype for a,
  • while Dog and Cat are subtypes for a

If we adopt a nominal type, variable a has Animal type. Otherwise, we can use structural types, so variable a can have a set of types: [Dog, Cat]

Wollok min and max types

Working in Wollok, we can have this definitions

a = new Dog()
a = new Cat()

Which types can a adopt? minTypes is a set: #{Dog type, Cat type}, determined by assignments; whilst in this sentence

a.walk()

maxTypes (maximalConcreteTypes) are solved. Which types can respond to walk() message? In our example, Dog and Cat. If we had this sentences:

a.walk()
a.meow()

maximalConcreteTypes has only an element: it must be Cat type, according to the messages they send. If you are in a method like this

method doSomething(a) {
   a.walk() 
   a.meow()
}

and there is no assignment, Wollok type system must infer minType based on messages sent (maxTypes). So variable a must be a Cat.

When maxType and minType match, then we can seal a variable. This implies you don't need to process inference type any longer.

Type inference

After analysis, type inference process is executed through the following steps:

  • adding cross reference constraints (for overriding methods and constructor calls)
  • running a list of strategies in different stages (there is a matrix so you can chain a strategy in a specific stage)

For example, these are some of the strategies and stages configured to run:

Stage Strategies
0 #[PropagateMinimalTypes, PropagateMaximalTypes, MaxTypesFromMessages]
1 #[OpenMethod]
2 #[UnifyVariables, SealVariables]
3 #[GuessMinTypeFromMaxType]

Each time you run a stage, some of the strategies can provide new information to the type system. If not, we can go to next stage. If there is something new to know, we must start inference process from stage 0 again. This process runs until in all stages all strategies find no new information about expression types.

Inference strategies

Inference strategies are classified using the following hierarchy

wolloktypesystem_inferencestrategies 1

TODO: Explanation of each strategy

Error report

Clone this wiki locally