In this post I want to summarize some of the reasons why I don’t like OCaml and why I wouldn’t choose it for a new project today.
To me it’s absolutely essential that the language should have some way of defining interfaces, implementing those interfaces for the types, and programming against those interfaces.
In Haskell, this is done with typeclasses. Rust has a similar mechanism called traits. In languages with classes, this is often done with abstract classes and “implementing” those classes in new classes (e.g. implements in Dart).
In OCaml there’s no way to do this. I have to explicitly pass functions along with my values, maybe in a product type, or with a functor, or as an argument.
Regardless of how I work around this limitation, it’s extremely inconvenient. Things that should be trivial in any code base, such as converting a value to a string for debugging purposes, become a chore, and are sometimes even impossible.
As far as I know, there was at least one attempt at ameliorating this with modular implicits (implicit parameter passing), but I don’t know what has happened to it since 2017. It looks like it’s still not a part of the language, and the standard library does not use it.
OCaml’s standard library is just bizarre. It has lots of small issues, and a few larger ones. It’s really just extremely painful to use.
Some examples of the issues:
- A zoo of printing/debugging and conversion functions such as string_of_int, string_of_float, print_char, Int64.of_int, …
- Overly polymorphic operators with type 'a -> 'a -> bool, such as = (called “structural equality”, throws an exception if you pass it a function) and >. Code that uses these operators will probably not work as expected on user-defined types.
- Standard types are sometimes persistent, sometimes mutable. List, Map, and Set are persistent. Stack and Hashtbl are mutable.
- Inconsistent naming: the length function for Map is cardinal, the length function for Hashtbl is length. The byte array type is Bytes.t, but the big int type is Big_int.big_int (instead of Big_int.t). The functions in these modules are also inconsistently named: Big_int functions are suffixed with _big_int, while Bytes module functions are not prefixed or suffixed.
- The regex module uses global state: string_match runs a regex and sets some global state; matched_string returns the last matched string using that global state.
- Lack of widely used operations such as popcount for integer types, and Unicode character operations.
- It doesn’t have proper string and character types: String is a byte array, char is a byte.
The bad state of OCaml’s standard library also causes fragmentation in the ecosystem with two competing alternatives: Core and Batteries.
OCaml doesn’t have a single-line comment syntax.
The expression syntax has just too many issues. It’s inconsistent in how it uses delimiters: for and while end with done, but let, if, match, and try don’t, even though the right-most non-terminal is the same in all of these productions:
expr ::= ...
| while <expr> do <expr> done
| for <value-name> = <expr> ( to | downto ) <expr> do <expr> done
| let <let-binding> in <expr>
| if <expr> then <expr> [ else <expr> ]
| match <expr> with (| <pattern> [ when <expr> ] -> <expr>)+
| try <expr> with (| <pattern> [ when <expr> ] -> <expr>)+
...
It has for and while, but no break and continue. So you use exceptions, with a try inside the loop for continue, and outside the loop for break.
It also has lots of ambiguities, and some of these ambiguities are resolved in an unintuitive way. In addition to making OCaml difficult to parse correctly, this can actually cause incorrect reading of the code.
The most common example is probably nesting match and try expressions:
match e0 with
| p1 -> try e1 with
        | p2 -> e2
| p3 -> e3
Here p3 -> e3 is a part of the try expression.
Another example is the sequencing syntax <expr> ; <expr> and productions with <expr> as the right-most symbol:
let test1 b =
  if b then
    print_string "1"
  else
    print_string "2"; print_string "3"
Here print_string "3" is not a part of the if expression, so this function always prints “3”.
However, even though match also has <expr> as the right-most symbol, it has different precedence in comparison to the semicolon:
let test2 b =
  match b with
  | true -> print_string "1"
  | false -> print_string "2"; print_string "3"
Here print_string "3" is a part of the false -> ... branch.
Try to guess how these functions are parsed:
(* Is the last print part of `else` or not? *)
let test3 b =
  if b then
    print_string "1"
  else
    let x = "2" in
    print_string x;
    print_string "3"
(* Is this well-typed? *)
let test4 b =
  if b then
    1, 2
  else
    3, 4
(* Is the type of this `(int * int) array -> unit` or `int array -> unit * int`? *)
let test5 a = a.(0) <- 1, 2
(* What if I replace `,` with `;`? Does this set the element 1 or 2? *)
let test6 a = a.(0) <- 1; 2
When writing OCaml you have to keep these rules in mind.
It also has the “dangling else” problem:
(* Is `else` part of the inner `if` or the outer? *)
if e1 then if e2 then e3 else e4
Finally, and I think this is probably the strangest thing about OCaml’s syntax (I’m not even sure what exactly is happening here, and I can’t find anything relevant in the language documentation): comments in OCaml are somehow tokenized, and those tokens need to be terminated. They can be terminated inside another comment, or even outside. This is a bit difficult to explain, but here’s a simple example:
(* " *)
print_string "hi"
OCaml 5.0.0 rejects this program with this error:
File "./test.ml", line 2, characters 16-17:
2 | print_string "hi"
^
String literal begins here
From the error message it seems like the " in the comment line actually starts a string literal, which is terminated by the first quote of "hi". The closing double quote of "hi" thus starts another string literal, which is not terminated.
However that doesn’t explain why this works:
(* " *)
print_string "hi"
(* " *)
print_string "bye"
If my explanation of the previous version were correct, this would fail with an unbound hi variable, but it works and prints “bye”!
I’m not following developments in the OCaml ecosystem too closely, but just two years ago it was common to use Makefiles to build OCaml projects. The language server barely worked on a project with less than 50 kloc. There was no standard way of doing compile-time metaprogramming, and some projects even used the C preprocessor (cpp).
Some of these things probably improved in the meantime, but the overall package is still not good enough compared to the alternatives.
Almost all modern statically typed languages have closures, higher-order functions/methods, lazy streams, and combinators that run efficiently. Persistent/immutable data structures can be implemented even in C.
Also, OCaml has no tracking of side effects (like in Haskell), and the language and the standard library have lots of features and functions with mutation, such as the array update syntax, mutable record fields, Hashtbl, and the regex module.
The only thing that makes OCaml more “functional” than e.g. Dart, Java, or Rust is that it supports tail calls. While having tail calls is important for functional programming, I would happily give up on tail calls if that means not having the problems listed above.
Also keep in mind that when you mix imperative and functional styles, tail calls become less important. For example, I don’t have to implement a stream map function in Dart with a tail call to map the rest of the stream; I can just use a while or for loop.
In my opinion there is no reason to use OCaml in a new project in 2023. If you have a reason to think that OCaml is the best choice for a new project please let me know your use case, I’m genuinely curious.
]]>All of the ideas shown in this post can be used to access a record field when the record’s concrete type is not known, but the type system guarantees that it has the accessed field. This includes row polymorphism and record subtyping.
Most of the ideas also work when the record’s type is completely unknown and it may not have the accessed field, but some of the optimizations assume accesses cannot fail. Those optimizations can only be used on statically-typed but polymorphic records.
In some of the examples below I will use row polymorphism.
In this blog post we are interested in a specific application of row polymorphism to records. In short, row polymorphism allows type variables denoting sets of record fields, with their types. For example:
f : ∀ r . { x : Int, y : Int | r } -> Int
f a = a.x + a.y
Here the type variable r ranges over sets of rows (or records). This function accepts any record as an argument as long as the record has at least x : Int and y : Int fields.
The main difference between row polymorphism and record subtyping is that the type variable r can be used on the right-hand side of an arrow as well, allowing passing the record around without losing its concrete type. For example:
mapAB : ∀ r . { a : Int, b : Int | r } -> (Int -> Int) -> { a : Int, b : Int | r }
mapAB r f = { a = f r.a, b = f r.b, .. r }
This function takes any record that has a : Int and b : Int fields, and returns a new record with updated a and b fields plus the rest of the fields. If I pass it a record with type { a : Int, b : Int, name : String }, I get the same type back.
With subtyping, the type of this function would look like:
mapAB : { a : Int, b : Int } -> (Int -> Int) -> { a : Int, b : Int }
In this version the return type has just the a and b fields. The rest of the fields are lost: if I pass this a { a : Int, b : Int, name : String }, I get { a : Int, b : Int } back. The name field is lost.
Without subtyping, when the record type in a field access expression is known, it’s easy to generate efficient code: we use the same offsets used when compiling a record literal with the type.
With subtyping, and with row polymorphism when the record type is not a concrete record type but a record type with a row variable, the type of r in r.a does not immediately tell us where in the record’s payload the field a is.
Let’s look at how we might go about implementing record field access in these cases.
I don’t think this idea is used in statically-typed languages, but I wanted to include it for completeness.
We can implement records as maps with string keys. Field access then becomes a map lookup.
This is easy to implement because our language probably already has a map implementation in the standard library.
The disadvantages are:
Depending on the map implementation, every field access requires an O(N) or O(log(N)) map lookup.
Map entries will be stored in a separate memory location (instead of in the record object’s payload), which will require pointer chasing to read the field value.
Unnecessary memory overhead caused by map fields that are not really necessary for records, such as the capacity and size fields.
With whole-program compilation, we can improve the constant factors a bit by mapping labels (field names) in the program to unique integers. This way lookups don’t require string hashing or comparison, but this is still slow and memory-inefficient compared to other techniques we will discuss below.
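As a rough illustration of this idea, here is a small Python sketch (Python is not the document’s language, and all names are illustrative): records as maps keyed by labels that are interned to sequential integers, so that field access is an integer-keyed lookup rather than a string-keyed one.

```python
# Toy sketch: records as maps, with field names interned to integers
# (standing in for a whole-program label-numbering pass).
LABELS = {}  # global label -> integer mapping

def intern_label(name):
    # Assign each distinct field name the next integer, once.
    return LABELS.setdefault(name, len(LABELS))

def make_record(**fields):
    # A record is just a map from interned labels to values.
    return {intern_label(name): value for name, value in fields.items()}

def get_field(record, name):
    # Field access is a map lookup keyed by the interned label.
    return record[intern_label(name)]

r = make_record(x=1, y=2, name="point")
print(get_field(r, "x"))  # 1
```

Even with interned labels, every access still pays for a hash-map lookup, which is what the techniques below avoid.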
If you’re familiar with Haskell, this is the Haskell way of implementing row polymorphic records.
The idea is that when we pass a record to a row-polymorphic function, we also pass, implicitly, and as functions, the accessors that the function needs.
In Haskell, the type of the mapAB we’ve seen above would look like this:
mapAB : ∀ r . (HasField r 'A Int, HasField r 'B Int) => Record r -> (Int -> Int) -> Record r
The runtime values for the HasField ... constraints are the accessors. When calling this function we don’t explicitly pass these accessors; the compiler generates them. In a well-typed program, we either have these values at the call site, or we know how to generate them (e.g. the record type is concrete at the call site), so it’s possible for the compiler to generate and pass these arguments.
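To make the dictionary-passing idea concrete, here is a hedged Python sketch (names are illustrative, not the Haskell machinery itself): the “evidence” for each HasField constraint is just a pair of getter/setter functions passed alongside the record, so the function body never needs to know the record’s layout.

```python
# Sketch of dictionary passing: accessor functions play the role of
# HasField evidence; the compiler would generate them at call sites.
def map_ab(get_a, set_a, get_b, set_b, record, f):
    # The function only uses the accessors it was handed, so any
    # record representation works, and extra fields are preserved.
    record = set_a(record, f(get_a(record)))
    record = set_b(record, f(get_b(record)))
    return record

# Hand-written "evidence" for records represented as dicts:
get_a = lambda r: r["a"]
set_a = lambda r, v: {**r, "a": v}
get_b = lambda r: r["b"]
set_b = lambda r, v: {**r, "b": v}

r = {"a": 1, "b": 2, "name": "hi"}
print(map_ab(get_a, set_a, get_b, set_b, r, lambda x: x + 1))
# {'a': 2, 'b': 3, 'name': 'hi'}
```

Note how the four explicitly passed accessors mirror the four implicit arguments discussed below: the cost of this approach is visible right in the parameter list.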
The main advantage of this approach is that it doesn’t require any language support specifically for records.
The main disadvantages are:
Every field access is a function call.
Parameter passing per field per record does not scale well and causes messy and slow generated code. For example, suppose we want to take two records with fields x : Int and y : Int:
f : ∀ r . (HasField r 'X Int, HasField r 'Y Int) => Record r -> Record r -> ...
This function takes two implicit arguments, but it has a limitation: the two record arguments need to have the same record type. I can’t call this function with two different records:
f { x = 123, y = 456, a = "hi" } { x = 0, y = -1, b = false }
For this to work I need two row variables:
f : ∀ r1 r2 .
    (HasField r1 'X Int, HasField r1 'Y Int,
     HasField r2 'X Int, HasField r2 'Y Int) =>
    Record r1 -> Record r2 -> ...
This version works, but it also takes 4 implicit arguments.
Starting with the next approach, we will require mapping labels (field names) to integers at compile time, to be used as indices.
Because these integers for labels will be used in record allocation and field accesses, it is possible that a label we see later in a program will cause different code generation for a record field access that we’ve already seen.
We have two options:
We can avoid this problem with a whole-program pass to collect all labels in the program.
This is trivial with a whole-program compiler as a front-end pass can store all labels seen in a component (library, module) somewhere and we can map those labels to integers before code generation.
We can have a link-time step to update record allocation and field access code with the integers for the labels.
In the rest of the post, labels will always get integers based on their lexicographical order and we will call these integers for labels just “labels”.
For example, if I have labels a, c, b, d in my program, their numbers will be 1, 3, 2, 4, respectively.
With integers as labels, we can add a table to every record (records with the same set of keys sharing the same table) mapping labels in the program to offsets in the record’s payload. For example, the table for a record with fields a and c, when the program has labels a, b, c, d, looks like this:
[ 0, _, 1, _ ]
This table is indexed by the label, and the value gives the offset of the field in the record’s payload. _ means the record does not have the field. In a well-typed program we won’t ever see a _ value being read from a table.
This approach is quite wasteful as every table will have as many entries as there are labels in the program, but we will compress these tables to reasonable sizes below.
We will call these tables “record offset tables” or “offset tables” in short. When compiling a record access we need to get the record’s offset table. For this we add an extra word (pointer) to record objects pointing to their offset tables. We then generate this code for a record field access:
record[record[OFFSET_TABLE_INDEX][label]]
OFFSET_TABLE_INDEX is the constant for where the offset table pointer is in record objects.
Offset tables are generated per record shape (set of labels), so the total number of tables shouldn’t be too large.
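Here is a small Python sketch of this scheme (Python lists stand in for heap objects, None stands in for _, and all names are illustrative): each record carries a reference to its shape’s offset table, and a field access is two indexing operations.

```python
# Labels a, b, c, d are numbered 0..3 by a whole-program pass.
LABELS = {"a": 0, "b": 1, "c": 2, "d": 3}

# Offset table shared by all records with shape {a, c}:
# label -> offset in the record's payload (None models '_').
AC_OFFSETS = [0, None, 1, None]

def make_ac_record(a, c):
    # Slot 0 is the offset-table pointer, fields follow.
    return [AC_OFFSETS, a, c]

def get_field(record, label):
    # record[OFFSET_TABLE_INDEX][label] gives the payload offset;
    # +1 skips the offset-table slot itself.
    offsets = record[0]
    return record[1 + offsets[LABELS[label]]]

r = make_ac_record(10, 30)
print(get_field(r, "c"))  # 30
```

In a well-typed program the None entries are never read, which is exactly why they can later be dropped or overlapped with other rows.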
Since the _ entries won’t ever be used, we can shrink the tables by dropping trailing _ entries. In our example above, with a record with a and c fields, the last _ entry can be omitted:
[ 0, _, 1 ]
Because offset tables are per-shape, and the total number of record shapes in a program should be small, if we allocate a few bits in record object headers for the “shape index” of the record, this index can be used to index a global table mapping record shapes to their offset tables.
Generated code for record access expressions will look like:
record[RECORD_OFFSET_TABLES[getRecordShapeId(record)][label]]
getRecordShapeId will read the bits in the object header for the record shape ID. Depending on the actual header layout, it will look something like:
int getRecordShapeId(Object* object) {
return (object->header & RECORD_ID_MASK) >> HEADER_BITS;
}
With record shape IDs in headers and a global table mapping shape IDs to offset tables, we no longer need an extra word in record objects for the offset table pointer.
Here’s an example of offset tables when we have labels a, b, x, y, and two records 0: {a, b} and 1: {x, y}:
RECORD_0_OFFSET_TABLE = [
0, // label a
1, // label b
_, // label x
_, // label y
];
RECORD_1_OFFSET_TABLE = [
_, // label a
_, // label b
0, // label x
1, // label y
];
RECORD_OFFSET_TABLES = [
RECORD_0_OFFSET_TABLE, // record 0
RECORD_1_OFFSET_TABLE, // record 1
];
As before, the offset table for record 0 can be shrunk as:
RECORD_0_OFFSET_TABLE = [
0, // label a
1, // label b
];
Labels that are never used in the same record can be given the same ID.
In the example above, this allows us to have a single table for both records:
RECORD_0_1_OFFSET_TABLE = [
0, // label a or x
1, // label b or y
];
RECORD_OFFSET_TABLES = [
RECORD_0_1_OFFSET_TABLE, // record 0
RECORD_0_1_OFFSET_TABLE, // record 1
];
The problem of assigning IDs to labels is very similar to stack allocation when spilling during register allocation. We have a practically infinite number of IDs (stack slots), but we want to reuse the same ID for labels as long as they’re never used in the same record (live at the same time).
After sharing label IDs, some of the shapes may be identical, as in our example. We can give those shapes the same ID and avoid redundant entries in the offset tables.
With this, our example with two records {a, b} and {x, y} compiles to just one offset table:
RECORD_0_1_OFFSET_TABLE = [
0, // label a or x
1, // label b or y
];
RECORD_OFFSET_TABLES = [
RECORD_0_1_OFFSET_TABLE, // record 0 and 1
];
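Putting the two sharing ideas together, here is a Python sketch (illustrative names, None for _) where a/x and b/y share label IDs and both shapes share one table and one shape ID:

```python
# Labels that never occur in the same record share an ID:
LABEL_IDS = {"a": 0, "x": 0, "b": 1, "y": 1}

# One offset table serves both shapes {a, b} and {x, y}.
RECORD_0_1_OFFSET_TABLE = [0, 1]
RECORD_OFFSET_TABLES = [RECORD_0_1_OFFSET_TABLE]

def get_field(shape_id, payload, label):
    # Two lookups: shape -> offset table, label -> payload offset.
    offsets = RECORD_OFFSET_TABLES[shape_id]
    return payload[offsets[LABEL_IDS[label]]]

ab = [10, 20]  # payload of an {a, b} record; merged shape ID 0
xy = [30, 40]  # payload of an {x, y} record; merged shape ID 0
print(get_field(0, ab, "b"), get_field(0, xy, "x"))  # 20 30
```

The type system guarantees we never ask an {a, b} record for x, so giving a and x the same ID is safe.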
Suppose we have these record shapes in a program:
0: {a, b, q}
1: {x, y, q}
The RECORD_OFFSET_TABLES table is currently an array of pointers, and indexing an offset table still requires pointer chasing. To avoid the pointer chasing we can flatten the table.
For our current program, the tables, without flattening, look like this:
RECORD_0_OFFSET_TABLE = [
0, // label a
1, // label b
_, // label x
_, // label y
2, // label q
];
RECORD_1_OFFSET_TABLE = [
_, // label a
_, // label b
0, // label x
1, // label y
2, // label q
];
RECORD_OFFSET_TABLES = [
RECORD_0_OFFSET_TABLE,
RECORD_1_OFFSET_TABLE,
];
We can flatten this as:
RECORD_0_OFFSET_TABLE = [
0, // label a
1, // label b
_, // label x
_, // label y
2, // label q
];
RECORD_1_OFFSET_TABLE = [
_, // label a
_, // label b
0, // label x
1, // label y
2, // label q
];
RECORD_LABEL_OFFSETS = [
0, // record 0, label a
1, // record 0, label b
_, // record 0, label x
_, // record 0, label y
2, // record 0, label q
_, // record 1, label a
_, // record 1, label b
0, // record 1, label x
1, // record 1, label y
2, // record 1, label q
];
Field indexing then becomes:
record[RECORD_LABEL_OFFSETS[(getRecordShapeId(record) * NUM_LABELS) + label]]
With this version we eliminate one layer of indirection.
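The flattened indexing can be sketched in Python as follows (illustrative names, None for _; labels numbered a=0, b=1, x=2, y=3, q=4):

```python
NUM_LABELS = 5
# Flat table for shapes 0: {a, b, q} and 1: {x, y, q}.
RECORD_LABEL_OFFSETS = [
    0, 1, None, None, 2,   # shape 0: labels a, b, x, y, q
    None, None, 0, 1, 2,   # shape 1: labels a, b, x, y, q
]
LABEL_IDS = {"a": 0, "b": 1, "x": 2, "y": 3, "q": 4}

def get_field(shape_id, payload, label):
    # One flat lookup: shape_id * NUM_LABELS + label.
    offset = RECORD_LABEL_OFFSETS[shape_id * NUM_LABELS + LABEL_IDS[label]]
    return payload[offset]

abq = [1, 2, 3]  # payload of an {a, b, q} record, shape ID 0
print(get_field(0, abq, "q"))  # 3
```

Only one array is indexed per access; the remaining cost is the multiplication, which the next step removes.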
The idea here is not too important on its own, but it will enable further improvements.
The NUM_LABELS factor in the field access code above can be eliminated by incrementing record shape IDs by NUM_LABELS instead of 1. In our example, instead of having record IDs 0 and 1, we will have 0 and 5 (incremented by the number of labels in the program).
Since there may be a large number of labels in a program, and we may have only a few bits to store the record IDs, an alternative is to convert the table to label-major order like this:
RECORD_LABEL_OFFSETS = [
0, // label a, record 0
_, // label a, record 1
1, // label b, record 0
_, // label b, record 1
_, // label x, record 0
0, // label x, record 1
_, // label y, record 0
1, // label y, record 1
2, // label q, record 0
2, // label q, record 1
];
With this table, indexing code becomes:
record[RECORD_LABEL_OFFSETS[(label * NUM_RECORDS) + getRecordShapeId(record)]]
We can then eliminate the NUM_RECORDS factor the same way, by incrementing label IDs by NUM_RECORDS instead of 1, and index with:
record[RECORD_LABEL_OFFSETS[label + getRecordShapeId(record)]]
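With the label IDs pre-multiplied, an access is a single add into the flat table. A Python sketch of this final form (illustrative names, None for _; label-major layout for shapes {a, b, q} and {x, y, q}):

```python
# Label IDs pre-incremented by NUM_RECORDS (2): a=0, b=2, x=4, y=6, q=8.
RECORD_LABEL_OFFSETS = [
    0, None,   # label a: shape 0, shape 1
    1, None,   # label b
    None, 0,   # label x
    None, 1,   # label y
    2, 2,      # label q
]
LABEL_IDS = {"a": 0, "b": 2, "x": 4, "y": 6, "q": 8}

def get_field(shape_id, payload, label):
    # Index is label + shape_id: no multiplication at the access site.
    return payload[RECORD_LABEL_OFFSETS[LABEL_IDS[label] + shape_id]]

xyq = [7, 8, 9]  # payload of an {x, y, q} record, shape ID 1
print(get_field(1, xyq, "y"))  # 8
```

The add-only indexing is what makes the gap-filling compression below possible: shifting a label’s row left is just decrementing its pre-multiplied ID.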
Now that the table index of a label is label + shape_id
and we have a single table, we can shift the entries in the table by decrementing label IDs.
For this it doesn’t matter whether we store in label-major or record-major order. Which one of these will generate a smaller table will probably depend on the program. As an example, suppose we store the table in label-major order, and we have these records in the program:
0: {x, y, z, t}
1: {x, y}
2: {z, t}
The table will look like:
[ 0, 0, _, // label x
1, 1, _, // label y
2, _, 0, // label z
3, _, 1 ] // label t
Record IDs will be 0, 1, 2, and label IDs will be 0, 3, 6, 9.
We can use the unused slot for label x, record 2, by decrementing the label ID of y by one. If we then do the same for z, the label IDs become 0, 2, 4, 7, and the table becomes:
[ 0, 0, // label x
1, 1, // label y
2, _, 0, // label z
3, _, 1 ] // label t
This idea can be used to fill any gaps in previous label rows, as long as the used slots in a row fit into the gaps. For example, if we have a table like:
[ 0, _, _, 1, // label x
_, 0, 1, _, // label y
... ]
We can decrement y’s ID to fit it into the row for label x:
[ 0, 0, 1, 1, // label x and y, interleaved
... ]
Collecting and numbering all labels in the program allows using a global table for mapping labels to offsets.
These offset tables can be made smaller by dropping unused trailing entries, giving the same ID to labels that never appear in the same record, sharing one table between identical shapes, flattening everything into a single table, and shifting rows to fill gaps.
The result is a very compact representation of record objects (no extra words in the header or unused space in the payload needed) and fast polymorphic field access.
The offset table should also be small in practice, because different parts of the program will probably use disjoint sets of names, and different labels and records will get the same IDs. In the remaining cases, tweaking label IDs to compact the table should help.
I’ve learned about the global table approach and some of the optimizations from the Dart compiler, which implements virtual calls using a “global dispatch table” (GDT), indexed by classID + methodID at call sites. See “Introduction to Dart VM” for a description of how Dart AOT and JIT generate GDTs.
If you are interested in seeing some code, here is where we generate the GDT in dart2wasm (Dart’s Wasm backend). The outer loop finds a selector ID (label ID in our examples) for a row (list of records in our examples, list of classes in dart2wasm). The inner loop, do { ... } while (!fits), starts from the first row with gaps and tries to fit the current row into the gaps. In the worst case it skips all of the rows, in which case the rest of the code appends the new row to the table.
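For illustration, here is a toy Python version of that first-fit packing (not the dart2wasm code; names are illustrative). Note that because it slides each row over every gap, it can pack rows even tighter than the hand-worked example above: here t’s row also lands in z’s gap.

```python
# Toy first-fit table compaction: slide each label row left until its
# used slots all land on gaps (None) in the table built so far; in the
# worst case the row is appended at the end.
def build_table(rows):
    table, label_ids = [], []
    for row in rows:
        start = 0
        while True:
            fits = all(
                v is None or start + i >= len(table) or table[start + i] is None
                for i, v in enumerate(row)
            )
            if fits:
                break
            start += 1
        # Grow the table if needed, then write the row's used slots.
        table.extend([None] * (start + len(row) - len(table)))
        for i, v in enumerate(row):
            if v is not None:
                table[start + i] = v
        label_ids.append(start)
    return table, label_ids

# Rows for labels x, y, z, t over shapes {x,y,z,t}, {x,y}, {z,t}:
rows = [[0, 0, None], [1, 1, None], [2, None, 0], [3, None, 1]]
table, ids = build_table(rows)
print(table)  # [0, 0, 1, 1, 2, 3, 0, 1]
print(ids)    # [0, 2, 4, 5]
```

Every (label ID + shape ID) index still resolves to the right offset; the unused slots of one row simply host the used slots of another.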
Dart will soon have records, and for the dart2wasm implementation of records I’m thinking of using some of the ideas described in this post. Dart records do not support width subtyping (you can’t pass {x, y, z} where {x, y} is expected), but because of the dynamic type, we can have a dynamically typed record that we index.
Thanks to José Manuel Calderón Trilla for his feedback on a draft of this blog post.
]]>I recently tweeted about this and got helpful responses that made me realize that I got my definitions wrong. As I thought more about what “anonymous type” means, it became clear to me that it’s not just about tuples or other types with special syntax instead of names. It’s more complicated than that.
So in this post I’d like to briefly talk about products and sums, and how names are used in type checking. I will then show a different way of type checking, and some examples from two widely used languages. Finally, I will argue that types are called “named” or “anonymous” depending on how they are checked.
Note that I’m not using any of these words as they are used in category theory or any other field of mathematics. These are mainly how I see them used in widely used PLs like Haskell, Rust, and OCaml, and in PL papers and books.
A value of a product type contains zero or more fields with potentially different types. Some example product types are:
- data Coordinate = Coordinate { x :: Int, y :: Int }: a product with two Int fields
- data D = D Int String Float: a product with Int, String, and Float fields
- data Empty = Empty: a product with no fields
Note that the way you access the fields does not matter. In the examples above, fields of a Coordinate value can be accessed with pattern matching, or with the generated functions x and y. In the second example, we can only access the fields with pattern matching.
What matters is: products contain zero or more fields. The fields can have different types.
A sum type specifies multiple “variants” (or “alternatives”), where each variant has a “name” (or “tag”, more on this later) and some number of fields.
A value of a sum type holds a name (or tag), and the fields of the variant with that name.
For example, if you have a parser for integers, you will want to return an integer when parsing succeeds, or an error message when something goes wrong. The sum type for the return value of your parse function would look like:
data ParseResult
= Success Int
| Fail String
Here, Success and Fail are the names of the variants. The Success variant has an Int field, and the Fail variant has a String field.
A value of this type does not contain an Int and a String at the same time. It’s either a Fail with a String field, or a Success with an Int field.
The way you access the fields is with pattern matching:
case parse_result of
Success int -> ...
Fail error_message -> ...
If I have two types, named T1 and T2, no matter how they are defined, they are considered different in Haskell, and in most other widely used typed languages (Rust, Java, …). This is called “nominal” type checking, where differently named types are considered different, even if they are “structurally” the same. For example, data T1 = T Int and data T2 = T Int are structurally the same, but you can’t pass a value of type T2 to a function that expects T1.
What “structurally the same” means is open to interpretation. We will come back to this later.
In addition, all types have names1, even types like tuples, which may look like they don’t have names the way our Coordinate or ParseResult do.
Tuples in most languages are just a bunch of product types, like the ones you can define yourself. They are often pre-defined for arities 0 to some number, and they have a special, “mixfix” syntax, with parentheses and commas to separate the fields. Other than that, they are no different than the ones you can define yourself.
You can see GHC’s definition of tuples here. In GHC, you can use the name directly if you don’t want the mixfix syntax, like (,) 1 2. So the name of a 2-ary tuple is (,) in Haskell, and it has a special syntax so you can write the more readable (1, 2) (or (Int, Int) in a type context). Other than syntax, there’s nothing special about tuples.
So it’s clear that most languages don’t have anonymous types. All types have some kind of names, and two types are only “compatible” if the names match.
Before defining what anonymous types are, I would like to give two examples, from PureScript and OCaml, where types are not checked based on their names, but based on their “structure”.
A record is a product type with named (or “labelled”) fields. Our Coordinate example is a record.
In PureScript, records can be defined without giving names to them. For example:
f :: { x :: Int, y :: Int } -> Int
f a = a.x + a.y
Here, f is a function that takes a record with two Int fields, named x and y, as an argument.
Here is a more interesting version of the same function:
f :: forall r . { x :: Int, y :: Int | r } -> Int
f a = a.x + a.y
This version takes a record with at least x :: Int and y :: Int fields, but it can have more fields. Using this version, this code type checks:
f { x: 1, y: 2, z: 3, t: 4 }
The r in this type is not too important. The important part is: in PureScript, records are not type checked nominally. Indeed, in the example above, the type of the record with 4 fields is not defined anywhere, and no names are used for the record in the type signature of f.
You might think that the record braces and commas are similar to the tuple syntax, so the name could be something like {,}, maybe applied to x :: Int somehow (assuming there is a type-level representation of field names).
However, even if that’s the case, type checking of these types is quite different than tuples. We’ve already seen that we can pass a record with more fields. You can also reorder the fields in the function type signature2, or in the record expression, and it still works.
So type checking of PureScript records is quite different than that of Haskell tuples.
This kind of type checking where you look at the “structure” rather than just the names is called structural type checking.
Now let’s take a look at an example for sum types.
OCaml has named sum types, just like Haskell’s. Here is the OCaml version of our ParseResult type:
type parse_result =
  | Success of int
  | Fail of string
The name of this type is parse_result (following OCaml naming conventions), and it is type checked exactly the same way it is in Haskell.
A second way of defining sum types in OCaml, and without names, is with polymorphic variants. Here’s the polymorphic variant for the same type:
type parse_result = [ `Success of int | `Fail of string ]
Crucially, even though we use a similar syntax with the type keyword, this is a type synonym. The right-hand side of this definition is an anonymous sum with two variants, tagged `Success and `Fail, with int and string fields, respectively.
Now, suppose I have a parse result handler, which, in addition to the success and failure cases, handles some “other” case as well:
let f = function
  | `Success i -> Printf.printf "Parse result: %d\n" i
  | `Fail msg -> Printf.printf "Parse failed: %s\n" msg
  | `Other -> Printf.printf "Wat?\n"
The type of this function, as inferred by the OCaml compiler, is:
[< `Fail of string | `Other | `Success of int ] -> unit
What this type says is that the function accepts any polymorphic variant that has the tags `Fail, `Other, and `Success (with the specified field types), or some subset of these tags. So if I have a value of type parse_result:
let x : parse_result = `Success 123
I can pass it to f, even though f’s argument type is not exactly parse_result. Here’s the full example, run in utop (utop # is the prompt, the lines after ;; are utop’s outputs):
utop # type parse_result = [ `Success of int | `Fail of string ];;
type parse_result = [ `Fail of string | `Success of int ]
utop # let f = function
  | `Success i -> Printf.printf "Parse result: %d\n" i
  | `Fail msg -> Printf.printf "Parse failed: %s\n" msg
  | `Other -> Printf.printf "Wat?\n";;
val f : [< `Fail of string | `Other | `Success of int ] -> unit = <fun>
utop # let x : parse_result = `Success 123;;
val x : parse_result = `Success 123
utop # f x;;
Parse result: 123
- : unit = ()
Neat!
Similar to PureScript records, and unlike Haskell tuples, type checking for OCaml polymorphic variants is structural, not nominal.
Now that we have seen structural type checking as an alternative to name-based (nominal) type checking, and some examples of it, here is my attempt at defining anonymous types: if named types are the ones type checked nominally, then the types that are type checked structurally are the “anonymous” ones.
In other words: named types are checked nominally, anonymous types are checked structurally.
According to this definition, Haskell, and many other languages, don’t have anonymous types, as all types are nominally checked. Tuples are no exception: they have names, and are type checked nominally.
PureScript records and OCaml polymorphic variants are great examples of anonymous products and sums, respectively.
Thanks to @_gilmi and @madgen_ for their helpful comments on a draft of this blog post.
With the exception of type synonyms. Type synonyms can be considered as simple macros for substituting types for names before type checking.↩︎
In Haskell, reordering stuff at the type level is often done with type families (type-level functions). Types are still checked nominally, but by rearranging them before type checking you can often have something somewhat similar to structural checking.↩︎
Suppose you have a no_std
crate that you want to use in two ways: (1) as a static library linked into non-Rust code, and (2) as a dependency of other Rust crates.
(1) is the main use case for this library. (2) is because you want to test this library and you want to be able to use Rust’s std
and other Rust libraries for testing.
The Rust crate type for (1) is staticlib
. For (2) you need rlib
. (documentation on crate types)
Here’s the problem. To be able to generate staticlib
you need to implement a panic handler as otherwise the code won’t know how to panic1. However, if you define a panic handler, you won’t be able to use your crate in other crates anymore as your panic handler will clash with the std
panic handler.
Four files are needed to demonstrate this:
-- Cargo.toml for the library
[package]
name = "nostd_lib"
version = "0.1.0"
authors = []
edition = "2018"
[lib]
crate-type = ["staticlib", "rlib"]
[profile.dev]
panic = "abort"
[profile.release]
panic = "abort"
-- lib.rs
#![no_std]
#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
loop {}
}
-- Cargo.toml for the importing crate
[package]
name = "nostd_bin"
version = "0.1.0"
authors = []
edition = "2018"
[dependencies]
nostd_lib = { path = "../nostd_lib" }
-- main.rs
extern crate nostd_lib;
fn main() {}
The library builds fine, but if you try to build nostd_bin
you’ll get this error:
error: duplicate lang item in crate `nostd_lib` (which `nostd_bin` depends on): `panic_impl`.
|
= note: the lang item is first defined in crate `std` (which `nostd_bin` depends on)
= note: first definition in `std` loaded from ...
= note: second definition in `nostd_lib` loaded from ...
Which says you now have two panic handlers: one in std
and one in your library.
If you remove the panic handler in the library then you won’t be able to build the library anymore:
error: `#[panic_handler]` function required, but not found
So you need some kind of conditional compilation, to define the panic handler only when generating the staticlib
. Unfortunately, conditional compilation based on crate type is currently not possible. It is also not possible to specify the target crate type when invoking cargo.
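Conditional compilation on cargo features, on the other hand, does work, and that's what the workaround relies on. As a reminder of how a feature-gated item behaves, here is a standalone sketch (the function name is mine, not part of the post's setup):

```rust
// Compiled only when cargo enables the `panic_handler` feature.
#[cfg(feature = "panic_handler")]
fn build_mode() -> &'static str {
    "panic handler compiled in"
}

// Compiled otherwise, e.g. when building the rlib without the feature.
#[cfg(not(feature = "panic_handler"))]
fn build_mode() -> &'static str {
    "no panic handler"
}

fn main() {
    println!("{}", build_mode());
}
```

Since the feature is off unless cargo turns it on, compiling this file directly picks the second definition.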
The least hacky way I could find to solve this (and without using anything other than just cargo build
to build) is by having two Cargo.toml
files.
Cargo really wants manifest files to be named Cargo.toml
, so we put the files in different directories. In my case the top-level one is for staticlib
and it looks like this:
[package]
name = "nostd_lib"
version = "0.1.0"
authors = []
edition = "2018"
[features]
default = ["panic_handler"]
panic_handler = []
[lib]
crate-type = ["staticlib"]
[profile.dev]
panic = "abort"
[profile.release]
panic = "abort"
I also update lib.rs
to only define the panic handler when the feature is enabled:
#[cfg(feature = "panic_handler")]
#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
...
}
Now I can build the library at the library’s top-level with just cargo build
. Because the panic_handler
feature is enabled by default in this Cargo.toml
, the panic handler will be defined by default with just cargo build
and the static library will build and work fine.
For the rlib
I create a similar Cargo.toml
in rlib
directory:
[package]
name = "nostd_lib"
version = "0.1.0"
authors = []
edition = "2018"
[lib]
crate-type = ["rlib"]
path = "../src/lib.rs"
[profile.dev]
panic = "abort"
[profile.release]
panic = "abort"
The differences are: this one only generates rlib
, doesn’t define the panic_handler
feature, and specifies the library source path explicitly (as it’s not in the default location relative to this Cargo.toml
). It’s fine to refer to a feature that you never define in Cargo.toml
in your code, so lib.rs
is still fine, and the panic handler will never be built when you build the crate with this Cargo.toml
.
Now in the importing crate I use this Cargo.toml
instead of the top-level one:
[dependencies]
nostd_lib = { path = "../nostd_lib/rlib" }
And it works fine. The downside is I have two Cargo.toml
files now, but in my case that’s not a big deal, as my Cargo.toml
is quite small and has no dependencies other than libc
2.
I hope this is helpful. If you know any better way to do conditional compilation based on crate types, or to solve the problem of generating usable staticlib
and rlib
s from a single no_std
crate, let me know!
You need a panic_handler
even if you never panic in your crate (assuming that’s possible). For example, you can’t compile fn main() {}
with no_std
, panic=abort
, and without a panic_handler
: the compiler complains about the missing panic handler.↩︎
If you’re working on a no_std
crate I think you won’t be able to find a lot of libraries that you can use anyway.↩︎
Here’s the summary of my 8 years writing Haskell pretty much non-stop:
In 2012 I wrote my first Haskell program, which was a chat server. I was reading “Real World Haskell” and “Learn You a Haskell for Great Good!” at the time and applying what I learned on this project.
In the same year I implemented my first programming language in Haskell. I don’t remember much about this project, I think it may be just a few extensions over the excellent Haskell tutorial “Write Yourself a Scheme in 48 hours”.
Also in 2012 I made a few commits to the programming language Fay. This was my first contribution to an open source compiler not written by me.
In 2013 I worked on four PL implementations, two of which were implemented from scratch in Haskell: A Prolog implementation and a K Lambda interpreter.
The other two projects were: A multi-stage ML-like language written in OCaml, and K Framework (in Java).
In 2014 I was accepted to Google Summer of Code to work on adding stack traces to GHCJS. The project was successful, and I made 88 commits to GHCJS during this period.
This was my first introduction to GHC. I made only one commit to GHC during this time, but I started reading the RTS and code generator to be able to implement cost-centre stacks in GHCJS, which taught me a lot.
Also in 2014, I briefly worked at a startup where I wrote Haskell.
In 2015 I joined Indiana University to do a PhD in programming languages. In my first semester I worked on the paper "Efficient Communication and Collection with Compact Normal Forms", which was about a GHC extension. The paper was published the same year at ICFP.
In the same year I briefly worked on a torrent client in Haskell.
According to git logs, 2015 was the year when I started making some larger commits to GHC. I think I made a few dozen commits that year. What was happening in the background is that I was working on unboxed sums. At the Haskell Implementors Workshop in 2015 my advisor gave a presentation on the efficiency of data representation in Haskell. I don't remember how the story developed, but I think we also talked to a few people at ICFP about how to improve the situation, and one of the ideas that came up was unboxed sums. IIRC I started working on it soon after returning from ICFP.
The first somewhat working version was implemented as a plugin, using lots of unsafe coercions under the hood. It was good enough to run some examples.
(In 2015, I also studied various metaprogramming and partial evaluation ideas quite extensively. If you look at my blog posts published in 2015 you’ll see a lot of related blog posts. There are also a few related git repositories in my Github page. I also gave a related talk at HIW 2015.)
Early 2016, I don’t remember what I was doing in too much detail. I remember taking an advanced OS class around that time and enjoying it very much. This was also the time where I started to realize that the tools I’m using (mostly GHC) are full of bugs, and very inefficient. I kept studying program transformation ideas, with the goal of making Haskell “fast”. I also started using C more, partly for the OS class, but also in my hobby projects. For example, the first commit of tiny was made in January 2016 and the code was in C.
In mid-2016 I left Bloomington for Cambridge, UK, for an internship at Microsoft Research with SPJ. We mainly worked on implementing unboxed sums properly in the compiler (instead of as a hacky plugin), but I also did a lot of GHC maintenance work there with supervision of SPJ.
Unboxed sums was merged during my time at MSR.
In the rest of the internship I did a lot of reading, did GHC maintenance, and biked around Cambridge.
Most importantly, during my time at MSR I realized that I’m no longer interested in academic research. I don’t enjoy writing papers. I don’t feel like pushing a field forward while most of the tools I use every day are badly broken, inefficient, usually both. I started having job interviews while I was in the UK. I visited two companies for interviews, one in London, another one in Cambridge.
I also emailed my advisor, saying that I don’t want to come back to Bloomington.
Job interviews went badly, and I was back at Indiana University. The rest of 2016 was pretty horrible. I was depressed. I had no interest in research. I still helped publish a paper, but I did not enjoy the process.
I still spent my last semester somewhat productively. I took enough classes that semester to leave IU with a master's degree instead of empty-handed (I was a PhD student, not a master's student). I also had some good job interviews and met good people from the Haskell community.
By the end of 2016 I accepted a job offer and left IU with a master's degree to write Haskell for a startup.
In 2017 I worked for this startup for a year. I wrote lots of networking and concurrent code, and learned a lot about these topics and about exception handling in Haskell. Until then my Haskell experience had mainly been in the context of compilers, so this was quite educational for me.
I left the company at the end of that year to join Well-Typed to work on GHC full-time.
My time at Well-Typed was great, but also full of challenges, mainly related to working remotely.
I worked on GHC between 30 and 40 hours a week (some weeks as little as 24 hours, but no less than that). A few weeks after I joined I started working on a new garbage collector with a colleague. When I joined the project there were only type definitions in header files, and almost no code. I implemented the first sequential prototype of the new collector. After that my colleague and I started collaborating more closely while implementing the concurrent version. We found many bugs in both the design and the implementation, and sorted out many edge cases during this time. I thoroughly enjoyed working on this project, even though it was clearly the most challenging project I ever worked on.
After the garbage collector I kept working as a maintainer until I left the company on a Sunday, Jun 21st, 2020. I made my last commit to a merge request I was working on on the 21st.
On 22 Jun 2020 I joined DFINITY to work on the Motoko programming language, and this is where the story ends.
At the time of this writing I have 383 commits in GHC and I'm the contributor with the 14th most commits. It feels bad to leave a project that I liked and contributed so much to, but it's also the right thing to do. After the GC was merged I started spending my time less and less productively, for many reasons, and I had lost my motivation to improve Haskell-the-language and GHC. Perhaps I can write more about these in another post.
gdb breakpoints can take conditions: for example, to break only on mmap(NULL, ...)
calls I can do
break mmap if addr == 0
and gdb doesn't break on mmap
when the addr == 0
condition doesn't hold.
I’ve used this many times to great effect, but it’s not always sufficient: sometimes I need to break not when a variable or argument has a specific value, but when the function is called (directly or indirectly) from another function. For example, when debugging a GHC RTS issue I sometimes want to inspect mmap
calls made by the garbage collector.
As far as I know this is not possible using the standard break
syntax, but gdb provides a Python API that allows setting breakpoints with conditions implemented in Python. Using this API it takes only a few lines to implement this:
class FrameBp(gdb.Breakpoint):
    def __init__(self, spec, *args, frame=None, **kwargs):
        self.frame = frame
        super(FrameBp, self).__init__(spec, *args, **kwargs)

    def stop(self):
        frame = gdb.selected_frame().older()
        while frame:
            if frame.name() == self.frame:
                return True
            frame = frame.older()
        return False
When calling the constructor the first argument is the breakpoint specifier, which is basically the part after break ...
in gdb’s break command. The frame
argument is the function we look for before actually breaking. We only break if the function exists in the backtrace. Here’s an example use:
>>> python FrameBp("mmap", frame="GarbageCollect")
Breakpoint 1 at 0x7f3366243f00: file ../sysdeps/unix/sysv/linux/mmap64.c, line 44.
This will only break on mmap
if the backtrace has GarbageCollect
at some point. An example backtrace when the breakpoint is hit:
Breakpoint 1, __GI___mmap64 (addr=0x4200200000, len=1048576, prot=3, flags=50, fd=-1, offset=0) at ../sysdeps/unix/sysv/linux/mmap64.c:44
44 if (offset & MMAP_OFF_MASK)
>>> bt
#0 __GI___mmap64 (addr=0x4200200000, len=1048576, prot=3, flags=50, fd=-1, offset=0) at ../sysdeps/unix/sysv/linux/mmap64.c:44
...
#19 0x0000000003022c83 in GarbageCollect (collect_gen=0, do_heap_census=false, deadlock_detect=false, gc_type=0, cap=0x37ef500
<MainCapability>, idle_cap=0x0) at rts/sm/GC.c:449
...
With some effort you could probably turn this into a proper gdb command and run it without the python ...
part, but so far this works well enough for me.
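For what it's worth, turning it into a command is mostly boilerplate around gdb's Command class. Here is an untested sketch against gdb's documented Python API (the command name break-in-frame is my invention), reusing the FrameBp class above:

# Untested sketch: registers a `break-in-frame SPEC FRAME` gdb command.
class FrameBreak(gdb.Command):
    """break-in-frame SPEC FRAME: break on SPEC only when called under FRAME."""

    def __init__(self):
        super(FrameBreak, self).__init__("break-in-frame", gdb.COMMAND_BREAKPOINTS)

    def invoke(self, arg, from_tty):
        spec, frame = gdb.string_to_argv(arg)
        FrameBp(spec, frame=frame)

FrameBreak()  # instantiating the class registers the command

After sourcing this you could write break-in-frame mmap GarbageCollect instead of the python ... invocation.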
It’s also shared on Twitter and /r/haskell. If you have any questions/comments feel free to ping me in any of these places, or add a comment below!
In this post I’m going to give two more examples, using the same expression representation from the previous post, and then talk about how to implement our passes using a different representation, without knot-tying.
Previously we attached arity and unfolding information to Id
s. Now suppose that our language is typed, and up to some point our transformations rely on typing information. Similar to arity and unfolding fields we add one more field to Id
:
data Id = Id
  { ..
  , idType :: Maybe Type
  }
The Maybe
part is because when we no longer need the types we want to be able to clear the type fields to make the AST smaller. While we have only one heap object per Id
, in an average program there’s still a lot of different Id
s, and Type
representation can get quite large, so this is worthwhile. This makes the working set smaller, which causes less GC work and improves compiler performance.
In our cyclic AST representation the only way to implement this without losing sharing is with a full-pass over the entire program, using knot-tying. The code is similar to the ones in the previous post.
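For concreteness, here is a sketch of what that type-dropping pass could look like (this code is mine, not from the previous post; it assumes the Id representation shown below, with the added idType field, and only shows the interesting cases):

dropTypesKnot :: Expr -> Expr
dropTypesKnot = go Map.empty
  where
    go ids e = case e of
      IdE id ->
        -- Replace occurrences with the type-less binder, keeping sharing.
        IdE (fromMaybe id (Map.lookup (idName id) ids))
      Let bndr rhs body ->
        let
          ids'  = Map.insert (idName bndr) bndr' ids
          rhs'  = go ids' rhs
          -- The knot: bndr' is inserted into ids' while being defined
          -- in terms of rhs', which is computed using ids'.
          bndr' = bndr{ idType = Nothing, idUnfolding = Just rhs' }
        in
          Let bndr' rhs' (go ids' body)
      ...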
Remember that in the previous post we represented the AST as:
data Expr
= IdE Id
| IntE Int
| Lam Id Expr
| App Expr Expr
| IfE Expr Expr Expr
| Let Id Expr Expr
data Id = Id
  { idName :: String
    -- ^ Unique name of the identifier
  , idArity :: Int
    -- ^ Arity of a lambda. 0 for non-lambdas.
  , idUnfolding :: Maybe Expr
    -- ^ RHS of a binder, used for inlining
  }
In this representation if I have a recursive definition like
let fac = \x . if x then x * fac (x - 1) else 1 in fac 5
For the fac
used in the lambda body I want to be able to call idUnfolding
and get the definition of this lambda. So the lambda refers to the Id
for fac
, and fac
refers to the lambda in its idUnfolding
field, forming a cycle.
In this representation the only way to implement this is with knot-tying. An implementation that maintains a map from binders to their RHSs to update unfoldings of Id
s in occurrence position does not work, because when we update an occurrence of the binder in its own RHS (i.e. in a recursive let
) we end up invalidating the RHS
that we’ve added to the map.
Here’s a knot-tying implementation that adds unfoldings (only the interesting bits):
addUnfoldings :: Expr -> Expr
addUnfoldings = go M.empty
  where
    go :: M.Map String Id -> Expr -> Expr
    go ids e = case e of
      IdE id ->
        IdE (fromMaybe id (M.lookup (idName id) ids))
      Let bndr rhs body ->
        let
          ids'  = M.insert (idName bndr) bndr' ids
          rhs'  = go ids' rhs
          bndr' = bndr{ idUnfolding = Just rhs' }
        in
          Let bndr{ idUnfolding = Just rhs' } rhs' (go ids' body)
      ...
As before we tie the knot in let
case and use it in Id
case.
It’s also possible to initialize idUnfolding
fields when parsing, using monadic knot-tying (MonadFix). Full code is shown at the end of this post, but the interesting bit is when parsing let
s and Id
s:
parseLet :: Parser Expr
parseLet = do
    _ <- string "let"
    id_name <- parseIdName
    _ <- char '='
    (id, rhs) <- mfix $ \ ~(id_, _rhs) -> do
      modify (Map.insert id_name id_)
      rhs <- parseExpr
      return (Id{ idName = id_name, idArity = 0, idUnfolding = Just rhs }, rhs)
    _ <- string "in"
    body <- parseExpr
    return (Let id rhs body)
parseId' :: Parser Id
parseId' = do
    name <- parseIdName
    id_map <- get
    let def = Id{ idName = name, idArity = 0, idUnfolding = Nothing }
    return (fromMaybe def (Map.lookup name id_map))
The idea is very similar. When parsing a let
we add a thunk for the binder with correct unfolding to a map. The map is then used when parsing Id
s in the RHS and body of the let
.
A well-known way of associating information with identifiers in a compiler is by using a “symbol table”. Instead of adding information about Id
s directly in the Id
fields, we maintain a table (or multiple tables) that map Id
s to the relevant information. Here’s one way to do this in our language:
data Expr
  = IdE String
  ...

data IdInfo = IdInfo
  { idArity :: Int
    -- ^ Arity of a lambda. 0 for non-lambdas.
  , idUnfolding :: Maybe Expr
    -- ^ RHS of a binder, used for inlining
  }

type SymTbl = Map.Map String IdInfo
In this representation we have to refer to the table for idArity
or idUnfolding
. That’s slightly more work than the previous representation where we could simply use the fields of an Id
, but a lot of other things become much simpler and more efficient.
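For example, a read through the table is a single lookup. A getter like getIdArity is not spelled out in the snippets below, but it would presumably look something like this, mirroring the post's setIdArity (my sketch):

getIdArity :: String -> State SymTbl Int
getIdArity id =
    -- Ids missing from the table default to arity 0, matching setIdArity.
    maybe 0 idArity . Map.lookup id <$> get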
Here’s dropUnusedBindings
in this representation (only the interesting bits, full code is at the end of this post):
dropUnusedBindings :: Expr -> State SymTbl Expr
dropUnusedBindings =
    fmap snd . go Set.empty
  where
    go :: Set.Set String -> Expr -> State SymTbl (Set.Set String, Expr)
    go free_vars e0 = case e0 of
      Let bndr e1 e2 -> do
        (free2, e2') <- go free_vars e2
        if Set.member bndr free2 then do
          (free1, e1') <- go free_vars e1
          setIdArity bndr (countLambdas e1')
          return (Set.delete bndr (Set.union free1 free2), Let bndr e1' e2')
        else
          return (free2, e2')
      ...
Our pass is now stateful (updates the symbol table) and written in monadic style. Knot-tying is gone. We update the symbol table after processing a let
RHS. Because Id
s no longer have the arity information we don’t need to update anything other than the symbol table.
It’s now trivial to implement addUnfoldings
:
addUnfoldings :: Expr -> State SymTbl ()
addUnfoldings e0 = case e0 of
  IdE{} ->
    return ()
  IntE{} ->
    return ()
  Lam _ body ->
    addUnfoldings body
  App e1 e2 -> do
    addUnfoldings e1
    addUnfoldings e2
  IfE e1 e2 e3 -> do
    addUnfoldings e1
    addUnfoldings e2
    addUnfoldings e3
  Let bndr e1 e2 -> do
    addUnfoldings e1
    addUnfoldings e2
    setIdUnfolding bndr e1
Doing it during parsing is also trivial, and shown in the full code at the end of this post. Updating typing information when we no longer need them is simply
dropTypes :: State SymTbl ()
dropTypes = modify (Map.map (\id_info -> id_info{ idType = Nothing }))
We could also maintain a separate table for typing information, in which case all we would have to do is stop using that table.
Easy!
Cyclic AST representation in a purely functional language necessitates knot-tying and relies on lazy evaluation. A well-known alternative is using symbol tables. It works across languages (does not rely on lazy evaluation) and keeps the code simple.
Cyclic representations make using the information easier, while symbol tables make updating easier. Code for updating the information is shown above and the previous post. For using the information, compare:
-- Get the information in a cyclic representation
... (idUnfolding id) ...
-- Get the information using a symbol table
unfolding <- getIdUnfolding id
To me the monadic version is not too bad in terms of verbosity or convenience, especially because Haskell makes state passing so easy.
Some of the problems with knot-tying were explained at the end of the previous post. What I did not mention there is the problems with efficiency, which are demonstrated better in this post.
In the “typing information” example, with the cyclic representation I need to copy the entire AST to update every single Id
occurrence and binder. With the symbol table I need to update just the table, which is much smaller than the AST.
In the unfolding example, with the cyclic representation I again need to copy the entire AST or use MonadFix
if I’m doing it in parsing. With a symbol table the pass does not update the AST, only updates the table. If I’m doing it in parsing then I simply add an entry to the table after parsing a let
. (full code at the end of this post)
At use sites, getIdArity
(a map lookup) does more work than idArity
(just follows a pointer). While I don’t have any benchmarks on this, I doubt that this is bad enough to make cyclic representation and knot-tying preferable.
Examples in these two posts are inspired by GHC:
GHC attaches information to Ids in an Id field with type IdInfo. The IdInfo type holds information like arity and unfolding. For typing information, Id has another field: varType. The code generator updates IdInfos with code generator-generated information.
In the first post I mostly argued that knot-tying makes things more complicated, and in this post I showed that knot-tying is necessary because of the cyclic representation. If we want to do the same without knot-tying we either have to introduce mutable references (e.g. IORef
s) in our AST (not shown in this post), or have to use a non-cyclic representation with symbol tables.
Between these two representations, I think non-cyclic representation with symbol tables is a better choice.
Full code (knot-tying)
-- Tried with GHC 8.6.4

{-# OPTIONS_GHC -Wall #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE FlexibleInstances #-}

import Data.List
import Data.Maybe
import Prelude hiding (id)

-- mtl-2.2
import Control.Monad.State
-- containers-0.6
import qualified Data.Map as Map
import qualified Data.Set as Set
-- megaparsec-7.0
import Text.Megaparsec hiding (State)
import Text.Megaparsec.Char
-- pretty-show-1.10
import Text.Show.Pretty

data Expr
  = IdE Id
  | IntE Int
  | Lam Id Expr
  | App Expr Expr
  | IfE Expr Expr Expr
  | Let Id Expr Expr
  deriving (Show)

data Id = Id
  { idName :: String
    -- ^ Unique name of the identifier
  , idArity :: Int
    -- ^ Arity of a lambda. 0 for non-lambdas.
  , idUnfolding :: Maybe Expr
    -- ^ RHS of a binder, used for inlining
  }

instance Show Id where
  show (Id name arity _) = "(Id " ++ show name ++ " " ++ show arity ++ ")"

--------------------------------------------------------------------------------
-- Initializing unfolding fields in parse time via MonadFix

type IdMap = Map.Map String Id

type Parser = ParsecT String String (State IdMap)

parseExpr :: Parser Expr
parseExpr = do
    exprs <- some $
      choice $
        map (\p -> p <* space)
          [ parseParens, parseIf, parseLam, parseInt,
            parseLet, try parseId ]
    return (foldl1' App exprs)

parseParens, parseIf, parseLam, parseInt,
  parseLet, parseId :: Parser Expr

parseParens = do
    _ <- char '('
    space
    expr <- parseExpr
    _ <- char ')'
    return expr

parseIf = do
    _ <- string "if"
    space
    condE <- parseExpr
    _ <- string "then"
    space
    thenE <- parseExpr
    _ <- string "else"
    space
    elseE <- parseExpr
    return (IfE condE thenE elseE)

parseLam = do
    _ <- char '\\'
    space
    id <- parseId'
    space
    _ <- char '.'
    space
    body <- parseExpr
    return (Lam id body)

parseInt = do
    chars <- some digitChar
    return (IntE (read chars))

parseLet = do
    _ <- string "let"
    space
    id_name <- parseIdName
    space
    _ <- char '='
    space
    (id, rhs) <- mfix $ \ ~(id_, _rhs) -> do
      modify (Map.insert id_name id_)
      rhs <- parseExpr
      return (Id{ idName = id_name, idArity = 0, idUnfolding = Just rhs }, rhs)
    _ <- string "in"
    space
    body <- parseExpr
    return (Let id rhs body)

parseId = IdE <$> parseId'

kws :: Set.Set String
kws = Set.fromList ["if", "then", "else", "let", "in"]

parseIdName :: Parser String
parseIdName = do
    name <- some letterChar
    guard (not (Set.member name kws))
    return name

parseId' :: Parser Id
parseId' = do
    name <- parseIdName
    id_map <- get
    let def = Id{ idName = name, idArity = 0, idUnfolding = Nothing }
    return (fromMaybe def (Map.lookup name id_map))

testPgm :: String -> Expr
testPgm pgm =
    case evalState (runParserT parseExpr "" pgm) Map.empty of
      Left (err_bundle :: ParseErrorBundle String String) ->
        error (errorBundlePretty err_bundle)
      Right expr ->
        expr

instance ShowErrorComponent [Char] where
  showErrorComponent x = x

--------------------------------------------------------------------------------
-- Initializing unfoldings with knot-tying

addUnfoldings :: Expr -> Expr
addUnfoldings = go Map.empty
  where
    go :: Map.Map String Id -> Expr -> Expr
    go ids e = case e of
      -- Interesting bits ------------------------------------------------------
      IdE id ->
        IdE (fromMaybe id (Map.lookup (idName id) ids))
      Let bndr rhs body ->
        let
          ids'  = Map.insert (idName bndr) bndr' ids
          rhs'  = go ids' rhs
          bndr' = bndr{ idUnfolding = Just rhs' }
        in
          Let bndr{ idUnfolding = Just rhs' } rhs' (go ids' body)
      --------------------------------------------------------------------------
      IntE{} ->
        e
      Lam arg body ->
        Lam arg (go ids body)
      App e1 e2 ->
        App (go ids e1) (go ids e2)
      IfE e1 e2 e3 ->
        IfE (go ids e1) (go ids e2) (go ids e3)
Full code (symbol table)
-- Tried with GHC 8.6.4

{-# OPTIONS_GHC -Wall #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE FlexibleInstances #-}

import Data.List
import Data.Maybe
import Prelude hiding (id)

-- mtl-2.2
import Control.Monad.State
-- containers-0.6
import qualified Data.Map as Map
import qualified Data.Set as Set
-- megaparsec-7.0
import Text.Megaparsec hiding (State)
import Text.Megaparsec.Char
-- pretty-show-1.10
import Text.Show.Pretty

import Debug.Trace

data Expr
  = IdE String
  | IntE Int
  | Lam String Expr
  | App Expr Expr
  | IfE Expr Expr Expr
  | Let String Expr Expr
  deriving (Show)

data IdInfo = IdInfo
  { idArity :: Int
    -- ^ Arity of a lambda. 0 for non-lambdas.
  , idUnfolding :: Maybe Expr
    -- ^ RHS of a binder, used for inlining
  , idType :: Maybe Type
    -- ^ Type of the id.
  }

data Type = Type -- Assume a large type

instance Show IdInfo where
  show (IdInfo arity _ _) = "(IdInfo " ++ show arity ++ ")"

type SymTbl = Map.Map String IdInfo

getIdInfo :: String -> State SymTbl (Maybe IdInfo)
getIdInfo id =
    Map.lookup id <$> get

setIdArity :: String -> Int -> State SymTbl ()
setIdArity id arity = modify (Map.alter alter id)
  where
    alter Nothing =
      Just IdInfo{ idArity = arity, idUnfolding = Nothing, idType = Nothing }
    alter (Just id_info) =
      Just id_info{ idArity = arity }

setIdUnfolding :: String -> Expr -> State SymTbl ()
setIdUnfolding id unfolding = modify (Map.alter alter id)
  where
    alter Nothing =
      Just IdInfo{ idUnfolding = Just unfolding, idArity = 0, idType = Nothing }
    alter (Just id_info) =
      Just id_info{ idUnfolding = Just unfolding }

countLambdas :: Expr -> Int
countLambdas (Lam _ rhs) = 1 + countLambdas rhs
countLambdas _ = 0

dropUnusedBindings :: Expr -> State SymTbl Expr
dropUnusedBindings =
    fmap snd . go Set.empty
  where
    go :: Set.Set String -> Expr -> State SymTbl (Set.Set String, Expr)
    go free_vars e0 = case e0 of
      IdE id ->
        return (Set.insert id free_vars, e0)
      IntE{} ->
        return (free_vars, e0)
      Lam arg body -> do
        (free_vars', body') <- go free_vars body
        return (Set.delete arg free_vars', Lam arg body')
      App e1 e2 -> do
        (free1, e1') <- go free_vars e1
        (free2, e2') <- go free_vars e2
        return (Set.union free1 free2, App e1' e2')
      IfE e1 e2 e3 -> do
        (free1, e1') <- go free_vars e1
        (free2, e2') <- go free_vars e2
        (free3, e3') <- go free_vars e3
        return (Set.unions [free1, free2, free3], IfE e1' e2' e3')
      Let bndr e1 e2 -> do
        (free2, e2') <- go free_vars e2
        if Set.member bndr free2 then do
          (free1, e1') <- go free_vars e1
          trace (ppShow e1') (return ())
          setIdArity bndr (countLambdas e1')
          return (Set.delete bndr (Set.union free1 free2), Let bndr e1' e2')
        else
          return (free2, e2')

addUnfoldings :: Expr -> State SymTbl ()
addUnfoldings e0 = case e0 of
  IdE{} ->
    return ()
  IntE{} ->
    return ()
  Lam _ body ->
    addUnfoldings body
  App e1 e2 -> do
    addUnfoldings e1
    addUnfoldings e2
  IfE e1 e2 e3 -> do
    addUnfoldings e1
    addUnfoldings e2
    addUnfoldings e3
  Let bndr e1 e2 -> do
    addUnfoldings e1
    addUnfoldings e2
    setIdUnfolding bndr e1

dropTypes :: State SymTbl ()
dropTypes = modify (Map.map (\id_info -> id_info{ idType = Nothing }))

pgm :: Expr
pgm = Let "fac" rhs body
  where
    rhs = Lam "x"
            (IfE (IdE "x")
                 (App (App (IdE "*") (IdE "x"))
                      (App (IdE "fac")
                           (App (App (IdE "-") (IdE "x")) (IntE 1))))
                 (IntE 1))
    body = App (IdE "fac") (IntE 5)

--------------------------------------------------------------------------------
-- Initializing unfolding fields in parse time, the boring way

type Parser = ParsecT String String (State SymTbl)

parseExpr :: Parser Expr
parseExpr = do
    exprs <- some $
      choice $
        map (\p -> p <* space)
          [ parseParens, parseIf, parseLam, parseInt,
            parseLet, try parseId ]
    return (foldl1' App exprs)

parseParens, parseIf, parseLam, parseInt,
  parseLet, parseId :: Parser Expr

parseParens = do
    _ <- char '('
    space
    expr <- parseExpr
    _ <- char ')'
    return expr

parseIf = do
    _ <- string "if"
    space
    condE <- parseExpr
    _ <- string "then"
    space
    thenE <- parseExpr
    _ <- string "else"
    space
    elseE <- parseExpr
    return (IfE condE thenE elseE)

parseLam = do
    _ <- char '\\'
    space
    id <- parseId'
    space
    _ <- char '.'
    space
    body <- parseExpr
    return (Lam id body)

parseInt = do
    chars <- some digitChar
    return (IntE (read chars))

parseLet = do
    _ <- string "let"
    space
    id <- parseId'
    space
    _ <- char '='
    space
    rhs <- parseExpr
    _ <- string "in"
    space
    body <- parseExpr
    lift (setIdUnfolding id rhs)
    return (Let id rhs body)

parseId = IdE <$> parseId'

kws :: Set.Set String
kws = Set.fromList ["if", "then", "else", "let", "in"]

parseId' :: Parser String
parseId' = do
    name <- some letterChar
    guard (not (Set.member name kws))
    return name

testPgm :: String -> Expr
testPgm pgm =
    case evalState (runParserT parseExpr "" pgm) Map.empty of
      Left (err_bundle :: ParseErrorBundle String String) ->
        error (errorBundlePretty err_bundle)
      Right expr ->
        expr

instance ShowErrorComponent [Char] where
  showErrorComponent x = x
data Expr
= IdE Id
| IntE Int
| Lam Id Expr
| App Expr Expr
| IfE Expr Expr Expr
| Let Id Expr Expr
When generating code, for an identifier that stands for a lambda, I want to know the arity of the lambda, so that I can generate more efficient code. While in this language a lambda takes only one argument, if I have something like
let f = \x . \y . \z . ...
in ...
I consider f
as having arity 3.
One way to implement this is having this information attached to every Id
:
data Id = Id
  { idName :: String
    -- ^ Unique name of the identifier
  , idArity :: Int
    -- ^ Arity of a lambda. 0 for non-lambdas.
  }
This way of associating information with Id
s makes some things very simple. For example, if I’m generating code for this application:
f 1 2
In AST:
App (App (IdE (Id { idName = "f", idArity = 3 })) (IntE 1)) (IntE 2)
I can simply use the idArity
field to see the arity of the function being applied. It doesn’t get any simpler than this.
In a program we usually have many references to a single Id, whether it's for a top-level function or an argument. If we allocate an Id for every occurrence, that's a lot of redundant allocation, which makes the AST representation larger and hurts compiler performance.
For example, if I have this expression:
f z + f t
A naive representation of this would be
App
  (App
     (IdE Id { idName = "+" , idArity = 2 })
     (App
        (IdE Id { idName = "f" , idArity = 0 })
        (IdE Id { idName = "z" , idArity = 0 })))
  (App
     (IdE Id { idName = "f" , idArity = 0 })
     (IdE Id { idName = "t" , idArity = 0 }))
Here for every occurrence of f we have a new Id, and these Ids all have the same arity. This is two Id heap objects used for the same identifier.
A more efficient representation would be
let f = Id { idName = "f", idArity = 0 } in
App
  (App
     (IdE Id { idName = "+" , idArity = 2 })
     (App
        (IdE f)
        (IdE Id { idName = "z" , idArity = 0 })))
  (App
     (IdE f)
     (IdE Id { idName = "t" , idArity = 0 }))
Here we only have one heap object for f, and all uses refer to that one object.
This is actually not hard to fix: we maintain a map from Id names to the actual Ids. When we see a let, we add the LHS to the map. When we see an identifier, we look it up. Easy.
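A sketch of such a sharing pass, assuming the Expr and Id types from above (shareIds is my name for it, not the post's):

```haskell
import qualified Data.Map as Map
import Data.Maybe (fromMaybe)

-- Minimal types matching the post's AST.
data Expr = IdE Id | IntE Int | Lam Id Expr | App Expr Expr | Let Id Expr Expr

data Id = Id { idName :: String, idArity :: Int }

-- Replace every occurrence of a bound name with the Id allocated at its
-- binding site, so that all uses share one heap object.
shareIds :: Map.Map String Id -> Expr -> Expr
shareIds env e = case e of
  IdE i        -> IdE (fromMaybe i (Map.lookup (idName i) env))
  Lam arg body -> Lam arg (shareIds (Map.insert (idName arg) arg env) body)
  App e1 e2    -> App (shareIds env e1) (shareIds env e2)
  Let bndr rhs body ->
    -- The RHS also sees the binder, so recursive uses get shared too.
    let env' = Map.insert (idName bndr) bndr env
    in Let bndr (shareIds env' rhs) (shareIds env' body)
  _            -> e
```

After this pass every occurrence of an identifier is (a pointer to) the Id introduced at its binding site.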
Suppose I want to implement a pass that drops unused bindings. For example:
let f = let a = e1
in \x . e2
in f z + f t
Here, if e2 doesn't use a, I want to drop the binding:
let f = \x . e2
in f z + f t
The AST for the original program is:
Let
  Id { idName = "f" , idArity = 0 }
  (Let
     Id { idName = "a" , idArity = 0 }
     <e1>
     (Lam Id { idName = "x" , idArity = 0 } <e2>))
  (App
     (App
        (IdE Id { idName = "+" , idArity = 2 })
        (App
           (IdE Id { idName = "f" , idArity = 0 })
           (IdE Id { idName = "z" , idArity = 0 })))
     (App
        (IdE Id { idName = "f" , idArity = 0 })
        (IdE Id { idName = "t" , idArity = 0 })))
Here’s a naive implementation of this pass:
dropUnusedBindings :: Expr -> Expr
dropUnusedBindings = snd . go Set.empty
  where
    go free_vars e0 = case e0 of
      IdE id ->
        (Set.insert (idName id) free_vars, e0)

      IntE{} ->
        (free_vars, e0)

      Lam arg body ->
        bimap (Set.delete (idName arg)) (Lam arg)
              (go free_vars body)

      App e1 e2 ->
        let
          (free1, e1') = go free_vars e1
          (free2, e2') = go free_vars e2
        in
          (Set.union free1 free2, App e1' e2')

      IfE e1 e2 e3 ->
        let
          (free1, e1') = go free_vars e1
          (free2, e2') = go free_vars e2
          (free3, e3') = go free_vars e3
        in
          (Set.unions [free1, free2, free3], IfE e1' e2' e3')

      Let bndr e1 e2 ->
        let
          (free1, e1') = first (Set.delete (idName bndr)) (go free_vars e1)
          (free2, e2') = go free_vars e2
        in
          if Set.member (idName bndr) free2
            then (Set.delete (idName bndr) (Set.union free1 free2),
                  Let (updateIdArity bndr e1') e1' e2')
            else (free2, e2')

updateIdArity :: Id -> Expr -> Id
updateIdArity id rhs = id{ idArity = countLambdas rhs }

countLambdas :: Expr -> Int
countLambdas (Lam _ rhs) = 1 + countLambdas rhs
countLambdas _ = 0
The problem with this pass is that it changes the arity of binders, but doesn't update the idArity of occurrences. Here's what I get if I run this over the original AST:
Let
  Id { idName = "f" , idArity = 1 }
  (Lam Id { idName = "x" , idArity = 0 } <e2>)
  (App
     (App
        (IdE Id { idName = "+" , idArity = 2 })
        (App
           (IdE Id { idName = "f" , idArity = 0 })
           (IdE Id { idName = "z" , idArity = 0 })))
     (App
        (IdE Id { idName = "f" , idArity = 0 })
        (IdE Id { idName = "t" , idArity = 0 })))
Note how f, which was not a lambda binder previously, became a lambda binder with arity 1. The pass correctly updated f's idArity in the binder position, but it did not update it in the occurrences! Indeed, in this representation it's not easy to do this efficiently.
Even if we solved the first problem and had only one heap object for f, the updateIdArity step in this pass allocates a new Id and loses sharing. So we would end up with something like:
let f = Id { idName = "f", idArity = 0 } in
Let
  Id { idName = "f" , idArity = 1 }
  (Lam Id { idName = "x" , idArity = 0 } <e2>)
  (App
     (App
        (IdE Id { idName = "+" , idArity = 2 })
        (App
           (IdE f)
           (IdE Id { idName = "z" , idArity = 0 })))
     (App
        (IdE f)
        (IdE Id { idName = "t" , idArity = 0 })))
The arity of f at the use sites is still wrong, and we lost sharing.
Knot-tying is a way of solving both of these in one step. I find it quite hard to explain in words so I’ll show the code (only the interesting bits):
dropUnusedBindings :: Expr -> Expr
dropUnusedBindings =
    snd . go Map.empty Set.empty
  where
    go :: Map.Map String Id -> Set.Set String -> Expr -> (Set.Set String, Expr)
    go binders free_vars e0 = case e0 of
      IdE id ->
        (Set.insert (idName id) free_vars,
         IdE (fromMaybe id (Map.lookup (idName id) binders)))

      Let bndr@Id{ idName = bndr_name } e1 e2 ->
        let
          bndr' = updateIdArity bndr e1'
          binders' = Map.insert bndr_name bndr' binders
          (free1, e1') = first (Set.delete bndr_name) (go binders' free_vars e1)
          (free2, e2') = go binders' free_vars e2
        in
          if Set.member bndr_name free2
            then (Set.delete bndr_name (Set.union free1 free2),
                  Let bndr' e1' e2')
            else (free2, e2')

      ...
...
The differences from the original version:

We now pass around a "binders" map that maps identifier names to actual Ids. This is used to common-up uses of an identifier with one shared heap object with correct arity info.

In the IdE case we now do a lookup on this map, and replace the Id with the shared Id with correct arity info from the map.
The tricky bit is the Let case, where we have a cyclic group of let bindings. binders' is the binder map with bndr carrying correct arity information. However, to be able to generate that map we first need to process e1, and while processing e1 we want to replace any occurrences of bndr with the correct Id too! This gives us the cyclic bindings:
bndr' = updateIdArity bndr e1'
binders' = Map.insert bndr_name bndr' binders
(..., e1') = ... (go binders' free_vars e1)
This technique relies heavily on lazy evaluation. In the original example the AST is not recursive, but suppose we also want to record RHSs of let binders in Ids, to be used for inlining:
data Id = Id
  { ...
  , idUnfolding :: Maybe Expr
    -- ^ RHS of a let binding, used for inlining
  }
Now once we implement sharing (solving problem 1) ASTs with recursive definitions will become cyclic. A simple example:
let fac = \x . if x then x * fac (x - 1) else 1 in fac 5
This will be represented as something like
pgm = Let fac_id rhs body
  where
    fac_id = Id { idName = "fac", idArity = 0, idUnfolding = Just rhs }

    rhs = Lam x_id (IfE (IdE x_id)
                        (App (App (IdE star_id) (IdE x_id))
                             (App (IdE fac_id)
                                  (App (App (IdE minus_id) (IdE x_id))
                                       (IntE 1))))
                        (IntE 1))

    body = App (IdE fac_id) (IntE 5)

    x_id = Id { idName = "x", idArity = 0, idUnfolding = Nothing }
    star_id = Id { idName = "*", idArity = 2, idUnfolding = Nothing }
    minus_id = Id { idName = "-", idArity = 2, idUnfolding = Nothing }
Here fac_id refers to rhs, which refers to fac_id, forming a cycle.
The knot-tying implementation of dropUnusedBindings works even in cases like this. We just need to update updateIdArity to update the unfolding, when it's available:
updateIdArity :: Id -> Expr -> Id
updateIdArity id rhs =
  id{ idArity = countLambdas rhs
    , idUnfolding = idUnfolding id $> rhs }
This is a bit hard to try, but if I implement a Show instance for Id that doesn't print the unfolding (to avoid looping), make fac_id's arity 0, and call dropUnusedBindings, this is the AST I get:
Let
  (Id "fac" 1)
  (Lam
     (Id "x" 0)
     (IfE
        (IdE (Id "x" 0))
        (App
           (App (IdE (Id "*" 2)) (IdE (Id "x" 0)))
           (App
              (IdE (Id "fac" 1))
              (App (App (IdE (Id "-" 2)) (IdE (Id "x" 0))) (IntE 1))))
        (IntE 1)))
  (App (IdE (Id "fac" 1)) (IntE 5))
All uses of fac have correct arity! Similarly I can do something hacky like this in GHCi to check that the unfolding has the correct arity for uses of fac too:
ghci> let Let lhs _ _ = dropUnusedBindings pgm
ghci> putStrLn (ppShow (idUnfolding lhs))
Just
  (Lam
     (Id "x" 0)
     (IfE
        (IdE (Id "x" 0))
        (App
           (App (IdE (Id "*" 2)) (IdE (Id "x" 0)))
           (App
              (IdE (Id "fac" 1))
              (App (App (IdE (Id "-" 2)) (IdE (Id "x" 0))) (IntE 1))))
        (IntE 1)))
Nice!
The main problem with this technique is that it's very difficult to understand. Even after working on different knot-tying code in GHC and implementing my own knot-tying passes, the recursive let bindings in the Let case above are still mind-boggling to me.
Secondly, it’s really hard to reason about the evaluation order of things in knot-tying code. You might think that this shouldn’t be an issue in a purely functional implementation, but in my experience any non-trivial compiler pass, even when implemented in a purely functional style, still needs debugging. Even if it’s not buggy, you may want to trace the evaluation and print a few things to understand how the code works.
Knot-tying code makes this, which should be absolutely trivial in any reasonable code base, very difficult. If you end up evaluating just the right places with your print statements, you end up looping. For example, here's our AST with a few bang patterns:
data Expr
  = IdE !Id
  | IntE Int
  | Lam Id Expr
  | App !Expr !Expr
  | IfE Expr !Expr Expr
  | Let Id Expr Expr

data Id = Id
  { idName :: String
  , idArity :: !Int
  }
If you run the same program above using this AST definition you'll see that the pass now loops. Note that I've removed the idUnfolding field just to demonstrate that this doesn't happen because we have a loop in the AST.
It’s even more frustrating when what you’re debugging is a loop. You add a few prints, and scratch your head thinking why none of your prints are working even though the algorithm is clearly looping. What’s really happening is that the code is indeed looping, but for a different reason…
Finally, because making things more strict potentially breaks things, knot-tying makes fixing some memory leaks very hard. For example, we may have many passes on our AST, one of them being our knot-tying pass. Some of these passes may be very leaky, and instead of adding strict applications or bang patterns to dozens of places, we may want to add bangs to only a few places in the AST. But that, as demonstrated above, causes our knot-tying pass to loop.
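The conflict between strictness and knot-tying can be shown with a toy example of my own (not from the post): a cyclic value can only be built while the recursive field stays lazy.

```haskell
-- Two nodes that refer to each other. This terminates only because
-- the 'next' field is lazy: each Node is built as a thunk-carrying
-- heap object before the other one is demanded.
data Node = Node { val :: Int, next :: Node }

cycleOf :: Int -> Int -> Node
cycleOf a b = na
  where
    na = Node a nb
    nb = Node b na

-- With  next :: !Node  the same definition would demand the entire
-- (infinite) structure while constructing it, and loop forever.
```

This is exactly what happens to the pass above: adding bangs to the AST forces the knotted bindings in the Let case before they are defined.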
GHC makes use of knot-tying extensively, which has always been one of the pain points for me since my first days contributing to GHC. I vaguely remember making my first contributions; I was a graduate student at Indiana University at the time. I remember finding it refreshing to be able to simply do idType and get the type of an identifier in GHC, as opposed to using a symbol table, which I'd been doing in some of the other compilers I'd worked on in the past.
At the same time, I was constantly confused that my simple print statements added in some front-end pass made the compiler loop. I had no idea what the reason could be. I had no idea that the thing I found so refreshing was also the reason why debugging and tracing were so much harder.
Suffice it to say, I don't like knot-tying. If I had to use knot-tying in my project I'd probably reconsider how I represent my data instead. For example, if we simply used a unique number for our identifiers and maintained a symbol table mapping the unique numbers to actual Ids, then we wouldn't have cycles for recursive functions in the AST and wouldn't need knot-tying. Updating something about an Id would be a simple update in the symbol table.
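Sketched out (all names are mine, and this is a hypothetical design, not code from the post), using Data.IntMap from containers:

```haskell
import qualified Data.IntMap as IntMap

-- Occurrences carry only a unique; facts like arity live in a table.
type Unique = Int

data IdInfo = IdInfo { infoName :: String, infoArity :: Int }

type SymTab = IntMap.IntMap IdInfo

-- Changing an Id's arity is a single table update; every occurrence
-- sees the new value on its next lookup. No cycles, no knot-tying.
setArity :: Unique -> Int -> SymTab -> SymTab
setArity u a = IntMap.adjust (\info -> info { infoArity = a }) u

lookupArity :: SymTab -> Unique -> Maybe Int
lookupArity tab u = infoArity <$> IntMap.lookup u tab
```

The trade-off is that every query now goes through the table, which is the indirection the idType-style representation was avoiding.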
Full code
-- Tried with GHC 8.6.4
{-# OPTIONS_GHC -Wall #-}
module Main where
import Data.Bifunctor
import Data.Functor
import Data.Maybe
import Prelude hiding (id)
-- containers-0.6
import qualified Data.Map as Map
import qualified Data.Set as Set
-- pretty-show-1.10
import Text.Show.Pretty
{-
data Expr
  = IdE !Id
  | IntE Int
  | Lam Id Expr
  | App !Expr !Expr
  | IfE Expr !Expr Expr
  | Let Id Expr Expr
  | Placeholder String
  deriving (Show)

data Id = Id
  { idName :: String
    -- ^ Unique name of the identifier
  , idArity :: !Int
    -- ^ Arity of a lambda. 0 for non-lambdas.
  }
-}
data Expr
  = IdE Id
  | IntE Int
  | Lam Id Expr
  | App Expr Expr
  | IfE Expr Expr Expr
  | Let Id Expr Expr
  | Placeholder String
  deriving (Show)
data Id = Id
  { idName :: String
    -- ^ Unique name of the identifier
  , idArity :: Int
    -- ^ Arity of a lambda. 0 for non-lambdas.
  , idUnfolding :: Maybe Expr
    -- ^ RHS of a binder, used for inlining
  }
instance Show Id where
show (Id name arity _) = "(Id " ++ show name ++ " " ++ show arity ++ ")"
{-
f_id = Id { idName = "f", idArity = 0 }
a_id = Id { idName = "a", idArity = 0 }
x_id = Id { idName = "x", idArity = 0 }
z_id = Id { idName = "z", idArity = 0 }
t_id = Id { idName = "t", idArity = 0 }
plus_id = Id { idName = "+", idArity = 2 }
f_x_plus_f_y = (App (App (IdE plus_id) (App (IdE f_id) (IdE z_id)))
(App (IdE f_id) (IdE t_id)))
ast1 = Let f_id (Let a_id (Placeholder "e1") (Lam x_id (Placeholder "e2"))) f_x_plus_f_y
ast2 = Let a_id (Placeholder "e1")
(Let f_id (Lam x_id (Placeholder "e2"))
f_x_plus_f_y)
-}
updateIdArity :: Id -> Expr -> Id
updateIdArity id rhs =
  id{ idArity = countLambdas rhs,
      idUnfolding = idUnfolding id $> rhs }

countLambdas :: Expr -> Int
countLambdas (Lam _ rhs) = 1 + countLambdas rhs
countLambdas _ = 0
dropUnusedBindings :: Expr -> Expr
dropUnusedBindings =
    snd . go Map.empty Set.empty
  where
    go :: Map.Map String Id -> Set.Set String -> Expr -> (Set.Set String, Expr)
    go binders free_vars e0 = case e0 of
      IdE id ->
        (Set.insert (idName id) free_vars,
         IdE (fromMaybe id (Map.lookup (idName id) binders)))

      IntE{} ->
        (free_vars, e0)

      Lam arg body ->
        bimap (Set.delete (idName arg)) (Lam arg)
              (go binders free_vars body)

      App e1 e2 ->
        let
          (free1, e1') = go binders free_vars e1
          (free2, e2') = go binders free_vars e2
        in
          (Set.union free1 free2, App e1' e2')

      IfE e1 e2 e3 ->
        let
          (free1, e1') = go binders free_vars e1
          (free2, e2') = go binders free_vars e2
          (free3, e3') = go binders free_vars e3
        in
          (Set.unions [free1, free2, free3], IfE e1' e2' e3')

      Let bndr@Id{ idName = bndr_name } e1 e2 ->
        let
          bndr' = updateIdArity bndr e1'
          binders' = Map.insert bndr_name bndr' binders
          (free1, e1') = first (Set.delete bndr_name) (go binders' free_vars e1)
          (free2, e2') = go binders' free_vars e2
        in
          if Set.member bndr_name free2
            then (Set.delete bndr_name (Set.union free1 free2),
                  Let bndr' e1' e2')
            else (free2, e2')

      Placeholder{} ->
        (free_vars, e0)
pgm :: Expr
pgm = Let fac_id rhs body
  where
    fac_id = Id { idName = "fac", idArity = 0, idUnfolding = Just rhs }

    rhs = Lam x_id (IfE (IdE x_id)
                        (App (App (IdE star_id) (IdE x_id))
                             (App (IdE fac_id)
                                  (App (App (IdE minus_id) (IdE x_id)) (IntE 1))))
                        (IntE 1))

    body = App (IdE fac_id) (IntE 5)

    x_id = Id { idName = "x", idArity = 0, idUnfolding = Nothing }
    star_id = Id { idName = "*", idArity = 2, idUnfolding = Nothing }
    minus_id = Id { idName = "-", idArity = 2, idUnfolding = Nothing }
main :: IO ()
main = putStrLn (ppShow (dropUnusedBindings pgm))
Thanks to Oleg Grenrus for reading a draft of this.
This post was originally written on 11 January 2019. Because it is more of an angry rant than a constructive piece, I wasn't sure at the time that publishing it was a good idea. However, reading it again now, I see that it's not directed at a person, a group, or a specific proposal/patch, so I think it shouldn't be offensive to anyone and I should be able to publish it on my personal blog.
(original post starts below)
So I woke up at 5AM today and felt like writing about one of my frustrations. These are my personal opinions, and I don’t represent GHC HQ here.
At this point adding new syntax to GHC/Haskell is a bad idea. Before moving on to examples, here are some facts:
The language that GHC supports is incredibly complex. GHC 8.6.3 man page lists 115 language pragmas.
You just can’t have a good understanding of all of these features and know interactions of the proposed syntax with all combinations of these.
GHC is a complex and old compiler with parts that today no active contributor knows well. The compiler (ignoring all the libraries, the RTS, tools etc.) currently has 189,699 lines of code (ignoring comments and whitespace). That’s a lot of complexity to deal with.
When you propose a new syntax, what you’re actually proposing is:
Because you can’t predict all the interactions of your new syntax (conceptually, or in the implementation) your syntax will cause a ton of problems.
Those problems will sit there unfixed for months/years.
GHC maintainers barely have enough time and manpower to provide stable releases. 8.6.1 and 8.6.2 are completely broken (#15544, #15696, #15892), and 8.6.3 doesn’t work well on Windows.
You might not accept some of these, however in my experience these are facts. If you disagree with any of these let me know and I can elaborate.
I’ll have only two examples for now, because I don’t normally work on front-end parts of the compiler I don’t notice most of the problems.
#7253 proposed a tiny new syntax in GHCi. A few years later a new contributor picked it up and submitted a patch. This trivial new syntax later caused #11606, #12091, #15721. That’s 3 too many tickets for a trivial syntax that buys us so little. It also generated at least one SO question, and invalidated an answer to another SO question by making things more complicated.
The implementation was finally fixed by a frustrated maintainer, but the additional complexity it added (both in the implementation and in the GHCi syntax to be explained to users) won't be fixed.
This was proposed as a GHC proposal. It's a trivial syntax change that in the best case saves 3 characters (including spaces). So far it has generated two tickets: #16137, #16097. Even worse than the previous example, none of these tickets even mention -XBlockArguments; they don't even use it! Yet the error messages got significantly worse because of it.
I think some of the extensions are quite useful. However I also think that at this point new syntax extensions are doing more harm than good. Problems from a maintainer’s point of view are as listed above (arguably maintainers’ problems are also users’ problems because they lead to poor product, but let’s ignore this aspect for now). Now I want to add one more problem, this time from a software developer/engineer’s point of view:
Here’s why. Now that we have two ways of using do
syntax:
-- (1)
atomically $ do
...
-- (2) with -XBlockArguments
atomically do
...
with my team I have to do one of these: (1) settle on one of the two styles, or (2) allow both.
(1) means wasting the team's time and energy on endless bikeshedding. (2) means being inconsistent in the source code. Either way we lose.
You might argue that with good tooling (1) is not a problem, and I’d agree. However as we add new syntax the tooling story will only get worse. GHC Haskell syntax is already so complex we don’t even have a good formatter. We should first stop making it even more complex if we want the tooling story to get better.
In my opinion what we need is principles to guide the language and the compiler. Currently we don’t have this (last paragraph), and the result is 100+ pragmas, a buggy compiler, and frustrated users and maintainers.
If you’re proposing a new syntax; don’t! If you know someone who will, point them to this blog post.