osa1

Internship report - type inference, row polymorphism, and multi-stage programming

April 15, 2013 - Tagged as: types, plt, en, multi-stage programming.

This is the report of my 3.5-month internship fulfilled at Ozyegin University under Prof. Baris Aktemur’s supervision.

I wrote this report as a short and informal introduction to the topics I worked on in my internship, and since I know nobody will ever read this internship report, I decided publishing it in my blog with hoping someone benefit from it.

A note before reading: Things went in an unexpected way shortly after I started writing the report: I got bored. So each chapter got shorter and shorter, leaving tons of interesting and important stuff unmentioned. Sorry for that.

Introduction to type systems and polymorphism

As far as software engineering concerned, a type system is a static analysis method that aims to prove a program ‘won’t go wrong’¹, meaning that the program will not crash at runtime. Static type checking can also be used for some compiler optimisations. Also, some languages have mechanisms for dispatching on types.

Type systems (in this text, this term is used as ‘static type systems’) do that by inspecting program terms and proving some properties and relations among them. This work is done before run-time, and most type information can be deleted so that it won’t cause any runtime costs.

For most type systems, process of type checking is ‘syntactic’. This means two things; first, the type checker only needs program text and it doesn’t need any information that can only be obtained at run-time. And second, type checking a program term only depends on type checking subterms of that program. These two properties also give us a somewhat simple way to express type checking on terms in a symbolic and formal way.

Type systems are ‘conservative’, meaning they will reject some ‘correct’ programs. A correct program means a program that won’t crash at runtime. For instance, this program(in OCaml syntax):

if true then 1 else 42(5)

is actually correct, and it can be given the type int (since we know it always evaluates to 1, which has type int), but almost all type systems reject this program because of the ‘else’ branch expression 42(5) that is ill-typed.

Research on type systems aims to have more “powerful” type systems. Being more powerful means accepting more correct programs and also being able to encode more invariants in the type system, so that more incorrect programs will be rejected.

Being wrong and being incorrect is used with different meanings in this text: a wrong program will crash at run-time, but incorrect program will result with a wrong answer. For instance, a wrong implemented algorithm may not be wrong, but it’s incorrect. So it will work fine, but return a wrong result.

As an instance of accepting more correct programs, this code is not typeable under simply-typed lambda calculus:

let id a = a in
(id 1, id true)

But it’s well-typed in polymorphic lambda calculus. As a second example, this is not typeable under polymorphic lambda calculus:

add5 : float -> float
...
add5 10

Here 10 has type int, so we can’t apply it to a function that expects a float value. Type systems with subtyping overcome this problem by defining a common supertype of types. In our case, float is already a supertype of int, so this example is well-typed in the presence of subtyping.

As an example of being able to encode more invariants in the type system, let’s look to the type of a sort function from a dependently-typed programming language:²

sort : List Int -> List Int

This type can be inhabited with some wrong functions. For instance, a function that takes a list and returns an empty list will have this type, just like reverse and shuffle functions.

sort : Vect Int n -> Vect Int n

This type has a invariant encoded in it: length of it’s input and output should be equal. This means our first function that takes a list and returns an empty list now can’t be inhabited by this type. We have a more precise type.

sort : (xs : Vect Int n) ->
       (ys : Vect Int n ** Permutation xs ys)

This type is even more precise, a function that takes a list and returns a list with some random elements can’t get this type.

One problem with more expressive type systems is that there may be some types that can’t be inferred by type system, but the type can be given by programmer. In some cases the type system may not be able to prove that an expresion has the denoted type. In case of dependently typed languages, programmer is required to give some proofs to the type checker. But even in much simpler type systems programmer may have to give some types manually.

–

A word about the terms ‘polymorphism’ and ‘subtyping’: Polymorphic type systems can give multiple types to a single piece of code, thus providing reuse. This is done by giving the code a general type and then creating instances of type depending on the use. As an example, let us start with a simple OCaml function, fst.

fst returns first element of any pair, it has the type ('a * 'b) -> 'a (the language of types is described in next section). This means for any pair with type of ('a * 'b), fst will return value with type of 'a'. When applied on a value typed (bool * int) it will return bool, and when applied on a value typed ( (string * bool) * string ) it will return (string * bool) etc.

Subtyping defines a relationship that if a type t1 is subtype of a type t2, this means that t1 can be given to context where t2 is expected. This usually means t1 already has all properties of t2.

For instance, a record with fields a: int, b: bool can be passed to a function that expects a parameter typed {a: int}. The same effect can also be obtained by row polymorphism, which we’ll see in it’s own chapter.

Subtyping is generally divided into two branches; nominal and structural subtyping. Structural subtyping relation is syntactic, ie. type system can decide if one type is subtype of another type just by inspecting their forms (like in the example above). Nominal subtyping requires definition from the programmer or a pre-defined type lattice from the language designer, ie. there is no way for a type system to infer subtyping relation between int and float types if this relation is not built-in; similarly a Square is not a subtype of Shape unless the programmer manually defines this subtyping relation.

Hindley-Damas-Milner type system and type inference

Manually giving types to program terms may be impractical, since programs can contain hundreds of definitions. Type inference is the process of deriving types for program terms. As an example of language with type inference, OCaml programs with no type annotations can be type-inferred and checked. This results in more concise programs. Type annotations can still be given for documentation or error reporting purposes.

Hindley-Damas-Milner(abbreviated as HM henceforth) type system is likely to be the most widely-used type system. It has a great property that if a term is typeable under HM type system, HM inference algorithm will result with a most general type (also called ‘principal type’).

HM is the system lies behind statically typed functional languages like the ML family of languages and Haskell.

In HM, a polymorphic type is specified as a ‘type scheme’. A type scheme is a type with universally quantified type variables. Type system’s job then is to give terms of a program type schemes so that for each instantiation of the scheme, the result is well-typed.

HM type system produces polymorphic types (type schemes) only for some expressions, in the case of ML family of languages, it’s the let expression. A let expression binds a name to an expression and introduces a scope that the name is bound to the expression. HM type system gives a most general type to that name, and so every use of this name in the scope is valid if a valid instantiation of the required type can be obtained from the type scheme it’s given.

let id a = a in
(id 1, id true)

Here the id function is given the type scheme of forall a. a -> a. The variable a is called to be ‘universally quantified’. In id 1, ids type is instantiated from type scheme as int -> int by substituting a with int. Same operation is done in id true and id is given the type bool -> bool. So this term is well-typed under HM.

Inferring a type scheme is done by assigning fresh type variables to terms with unknown types and unifying this variables with concrete types while type checking rest of the term. At the end, types that are not unified with concrete types will remain polymorphic.

This is also called let-polymorphism because polymorphic type (type scheme) generation is only done in let expressions.

Note that not all names defined in a let expression can be generalized. As an example, see this classic example:

let c = ref (fun x -> x) in
c := (fun x -> x+1);
!c true

Giving c the type forall a. ('a -> 'a) ref (in OCaml syntax) makes this example well-typed, which is wrong. Instead, type system should have given the type ( '_a -> '_a) ref, then during the type checking c := (fun x -> x + 1), _a should be unified with int. Note that since ('_a -> '_a) ref doesn’t quantify the variable a, after this point, type of c will be (int -> int_ ref, so applying it a bool value will fail.

Let-polymorphism has some interesting properties. It’s simple, but it still accepts realistic programs. And even though its worst-case efficiency is exponential on the size of input program, most realistic programs don’t hit the worst case and have linear time complexity. For this reason, let-polymorphism is named as the ‘sweet-spot’ of type systems.

Another great property of HM type system is that it can be reduced to constraint generation and solving steps. These separate steps lead to a more modular algorithm, with changeable constraint generators and solvers. HM(X)³ is a formalization of this observation.

Row polymorphism

Rows are a way to encode ‘labeled products’. A product type is a type that contains some other types; for example, a pair is a product type of two other types ((int * bool) is a product type with int and bool parts).

A ‘labeled sum’ is a product type, but with labels. For instance:

type error = { code : int; reason : string }

is just like an (int * string) type, but it has labels for subparts. There are severals ways to encode labeled products as rows. Here’s one way:

type error = { code : abs int; reason : abs int; rho[code,int] }

This syntax is similar to Rémy’s, which was also used in our project. Last rho is the polymorphic part, it means that an error type is a row type with code and reason fields with gives types, but with a polymorphic part that can be unified with any record except the fields labeled with code or int (some systems accept field repetition, like in Daan Leijen’s “Extensible records with scoped labels”).

Row polymorphism gives us a subtyping-like effect. For example, this function:

fun x -> x.a + 1

Now has the type { a : int; rho[a] } -> int and we can apply it to the record { x : some_type; y : some_other_type; a : int; ... }, like in structural subtyping.

The way row polymorphism and structural subtyping accept this term are completely different though. In row polymorphism, parameter type {a : int; rho[a] } is being unified with { x : some_type; y : some_other_type; a : int; ... }, so parameter type of function is actually getting more specialized. On the other hand, in subtyping, extra fields of actual parameter is being forgotten, like casting a type to it’s super type.

More words will be said about row polymorphism in later sections.

Multi-stage programming

Multi-stage programming languages makes distinguished evaluation stages in run-time. Some part of the program can be generated depending on some user input.

Evaluation is done at stage 0. A program generated in staged 0 has stage 1, program generated in stage 1 has stage 2 etc. This notion of separate stages has some semantic effects. Each stage has a scope, but most systems offer a mechanism to lift a value from a lower stage.

Being able to generate code in run-time gives us a way to specialize algorithms depending on user input. Let’s look at a classic example:

let rec pow_body_gen n =
  if n = 0 then <1>
  else <a * ~(pow_body_gen (n-1))>;;

pow_body_gen takes an integer n and returns the code a * a * a * ... * 1, a to the power of n. With the help of this function, we can generate a specialized function power_five which raises an integer to the power 5, but without running a loop or recursively calling a function:

let rec power_five a = <let a = ~lift(a) in ~(pow_body_gen 5)>;;

For example, output of power_five 12 in our interpreter is <let a = 12 in (a * (a * (a * (a * (a * 1)))))>. This is a code value. Notice the a value, which is multiplied by itself, 5 times.

To actually run this code, we can call run(stage primitives will be explained shortly):

> run(power_five 12);;
248832

Our multi-stage language has three primitives for staging operations: angle brackets indicate a staged expression, contents of staged expression will be run in stage n+1, where n is the current stage level. Tilde (~) ‘unboxes’ a staged expression, expressions indicated with a unboxing operator will be run in stage n-1. Unboxing operator can only be used in stages n > 0. Finally, run primitive is used to run a closed code value at stage 0.

Note that code values can be open, but only closed code values can be run. Our pow_body_gen function returns a code value with an unbound variable a, then in power_five, we define a in stage 1, and by unboxing pow_body_gen 5, we actually generate a bigger code value, with a defined. Now our code value is closed, so it can be run.

There is also a relation between multi-stage programming and partial evaluation, but this is out of this text’s scope.

Our work

Wallace is a type inference library, supporting subtyping. “Its goal is to serve as a plug-in component in the design of a constraint-based type-checker, regardless of the programming language being analyzed.”⁴

We used Wallace to get subtyping in our language. Wallace comes with an example language, called ‘toy’, which has extensible records built-in. For simplicity, instead of generating constraints for Wallace, we first translated our multi-stage language to record calculus. Translation to record calculus preserves semantics, but eliminates staged computations⁵. Our record calculus is then translated to ‘toy’. ‘toy’ infers types for us, by generating and passing constraints to Wallace.

We then compared our type system with⁶. This expression:

run((fun c -> <(let x = 1 in ~c, let y = 1 in ~c)>) <1>);;

is not typeable under polymorphic type system, but is well-typed in our system with subtyping. By mapping the calculated type back to our multi-stage language, we got subtyping in the multi-stage language for free.

Wallace is used as is, without any modifications. But we changed ‘toy’ language for our purposes. We first added some numerical types with subtyping relations. ‘toy’ had a problem that it was generalizing every let bindings, without value restriction. To overcome this problem, we added an extra lambda wrapper in the translator when value restriction is needed. Since our type system already had value restriction, it was trivial to apply the same test to check if an expression is expansive. Expansive expressions are those that may allocate new memory cells. Expansive expressions are not generalized.

Another problem with ‘toy’ language is that it doesn’t support toplevel declarations, and adding that wasn’t easy. To overcome this without any serious modifications in ‘toy’s source, we collected all well-typed toplevel expressions and declarations, and then generated a let .. in .. chain of toplevel declarations to give ’toy’ type checker. This means for every new toplevel phrase, ‘toy’ now type checks all old expressions too. This may not be efficient but works fine with our purposes, and it was easy to implement.

Implementing type inference: Implementing a type inference system is hard to do elegantly. One problem is already discussed in my blog (in Turkish). Every non-trivial type system requires types to be separated into kinds. After that, every type variable will also have kind information and it’s a type error to unify a type variable with wrong kinds. Type applications also checked with the help of kinds.

Kinds are generally checked at run-time. Type systems with kinds usually lead to a simpler type language. For instance, almost all types mentioned in this text can be represented by this simply type language:

type kind =
  | KStar                 (* kind of term types *)
  | KRow                  (* kind of row types *)
  | KArr of (kind * kind) (* kind of type constructors *)

type ty =
  | TCon of tycon     (* constant *)
  | TVar of tyvar     (* type variable *)
  | TApp of (ty * ty)
      (* type application, to be well-typed
         kind of first ty should be KArr (k2, k)
         and second ty should be k2 *)
and tyvar = (tyvarlink ref * kind)
and tyvarlink =
  | NoLink of id (* just a type variable *)
  | LinkTo of ty (* equated to a ty *)
and tycon = (id * kind) (* kind should be always KStar *)

We decided to take a different path, instead of keeping kind information for every type, we created new type constructors for every type and then separating differently kinded type variables with different types. This lead us to this type language:

type ty =
 | TInt
 | TBool
 | TUnit

 | TPair of ty * ty

 | TList of ty
 | TRef  of ty

 | TFun of ty * ty
 | TRec of tyrec
 | TVar of typevar

 | TBox of tyrec * ty

...

and typevar = (tyvarlink * int) ref
and tyvarlink = ty link

and fieldvar = (fieldvarlink * int) ref
and fieldvarlink = field link

and recvar = (recvarlink * int * IdSet.t) ref
and recvarlink = tyrec link

We also needed a sum type for handling differently kinded type variables:

and linkvar =
 | TV of typevar
 | FV of fieldvar
 | RV of recvar

This also made our algorithms more complex, since we had more cases to handle.

This representation is not without any advantages though. For instance, writing a pretty-printer was very easy, because we could easily print a pair and a function or a list and a ref differently, even though they share similar structure, with simple pattern matching.

A note on pretty-printing: OCaml has a great pretty-printer library in stdlib. It’s simple but powerful at the same time, allowing printing most complex structures easily. It interprets some characters in input strings as boxes/spaces indicators etc. and formats the text. For instance, this code:

printf "@[<hov 2>~("; print_exp exp; printf ")@]"

Here @[<hov 2> part creates a new “horizontal or vertical” box. This box has a property that when a line is split to two lines, every other newline pointers will also be split into separate lines. So contents of this box is either displayed as a single line, or separated from every newline pointers. Later, print_exp exp; part prints the expression inside the box. And lastly, `printf “)@]” closes the paren, then closes the box, so that the text comes later will not be bound with this line split rule.

This gives an easy and concise way to print even most complex structures.