-
Notifications
You must be signed in to change notification settings - Fork 16
Constraints based Wollok Type System
Welcome to the Constraints-based Wollok Type System. Paper aquí
A Type System is a method for reasoning about programs, classifying expressions according to the kinds of values they compute
-- Benjamin Pierce, Types and Programming Languages, MIT Press
A type system has several objectives:
- It can determine whether a program has errors or not (related to language safety)
- It helps to find concepts and abstractions (specially for large systems)
- They provide important information for documentation tools such as Static Diagram, Outline view, Autocompletion, etc. ...
- ...and in Wollok language, which has no explicit type annotations, it is a key tool for type inference.
Main process begins in ConstraintBasedTypeSystem class, in package org.uqbar.project.wollok.typesystem.constraints, and consists in several phases:
- Initialization
- Analysis
- Type inference
- Error reporting
In initialization process
- we add all core WKOs, such as assert, void, game, error and Wollok Game keys
- a TypeVariablesRegistry is created. This registry is a Map of URI => TypeVariable associations
- TypeVariablesRegistry is populated via WollokCoreTypeDeclarations
- main method in WollokCoreTypeDeclarations is declarations, which has a special DSL for defining core (lib) types:
PairType.constructor(Any, Any)
PairType >> "getKey" === #[] => Any;
Number >> "between" === #[Number, Number] => Boolean
Number % Number => Number
This DSL is implemented in superclass TypeDeclarations, with methods like operator_doubleArrow, constructor, etc. For example:
def operator_doubleArrow(List<? extends TypeAnnotation> parameterTypes, TypeAnnotation returnType) {
new MethodTypeDeclaration(parameterTypes, returnType)
}
In a near future, we could change this behavior replacing it by annotations in lang, lib and other wlk core files. Meanwhile we should annotate all contructors and methods of all wko, classes and mixins coded in core files.
When does initialization happen? Right now it is a validator, but since Type System must infer all types around the project and not just one file, an issue was created.
Logger is configured thru log4-normal.properties files. It is important to set additivity configuration in order to avoid duplicated logs.
log4j.additivity.org.uqbar=false
log4j.additivity.org.eclipse=false
log4j.additivity.org.uqbar.project.wollok.typesystem=false
Default configuration is set to DEBUG, if you want to change severity please pay attention to this line:
log4j.logger.org.uqbar.project.wollok.typesystem=DEBUG, stdout
ConstraintGenerator is responsible for analysis phase, its main method is generateVariables.
When analysing a file (a WFile) there are two steps:
- Adding globals, like wko and classes.
- Whole constraint generation process
def dispatch void generateVariables(WFile it) {
eContents.forEach[addGlobals]
eContents.forEach[generateVariables]
}
Constraint generation process traverses AST nodes, with different consequences: for example, if you have a class, you must analyse its
- hierarchy (parents)
- mixins
- members
- constructors
def dispatch void generateVariables(WClass it) {
// TODO Process supertype information: parent and mixins
members.forEach[generateVariables]
constructors.forEach[generateVariables]
}
In case of a String literal, like "wollok", you must associate it with String Wollok type.
def dispatch void generateVariables(WStringLiteral it) {
newSealed(classType(STRING))
}
👇
We should ensure a dispatch method generateVariables() for every Wollok grammar element (WTest, WStringLiteral, WTry, etc.)
For example, if we have this file definition:
object pepita {
var energy = 0
method fly(kms) {
energy = energy - kms
}
}
Corresponding AST will be:
So, there must be a generateVariables dispatch method for WFile, WNamedObject, WVariableDeclaration, WMethodDeclaration, etc.
ConstraintGenerator defines the variable generation strategy, which is delegated to TypeVariablesRegistry class. Before we go on it is important to study how type information is modeled.
Every variable has its type (TypeVariable class), which in turn has
- subtypes
- supertypes
- type info
- and owner
- errors are reported by type system
Given this hierarchy
Let's analyze this Java-like code
Animal a = new Dog()
- Animal is a supertype for a,
- while Dog is a subtype for a
In this example
Animal a = if (b) new Dog() else new Cat()
- Animal is a supertype for a,
- while Dog and Cat are subtypes for a
If we adopt a nominal type, variable a has Animal type. Otherwise, we can use structural types, so variable a can have a set of types: [Dog, Cat]
Working in Wollok, we can have this definitions
a = new Dog()
a = new Cat()
Which types can a adopt? minTypes is a set: #{Dog type, Cat type}, determined by assignments; whilst in this sentence
a.walk()
maxTypes (maximalConcreteTypes) are solved. Which types can respond to walk() message? In our example, Dog and Cat. If we had this sentences:
a.walk()
a.meow()
maximalConcreteTypes has only an element: it must be Cat type, according to the messages they send. If you are in a method like this
method doSomething(a) {
a.walk()
a.meow()
}
and there is no assignment, Wollok type system must infer minType based on messages sent (maxTypes). So variable a must be a Cat.
When maxType and minType match, then we can seal a variable. This implies you don't need to process inference type any longer.
After analysis, type inference process is executed through the following steps:
- adding cross reference constraints (for overriding methods and constructor calls)
- running a list of strategies in different stages (there is a matrix so you can chain a strategy in a specific stage)
For example, these are some of the strategies and stages configured to run:
Stage | Strategies |
---|---|
0 | #[PropagateMinimalTypes, PropagateMaximalTypes, MaxTypesFromMessages] |
1 | #[OpenMethod] |
2 | #[UnifyVariables, SealVariables] |
3 | #[GuessMinTypeFromMaxType] |
Each time you run a stage, some of the strategies can provide new information to the type system. If not, we can go to next stage. If there is something new to know, we must start inference process from stage 0 again. This process runs until in all stages all strategies find no new information about expression types.
Inference strategies are classified using the following hierarchy
TODO: Explanation of each strategy