April 24, 2023 - Tagged as: en, plt, ocaml.
Since 2013 I’ve had the chance to use OCaml a few times in different jobs, and I got frustrated and disappointed every time I had to use it. I just don’t enjoy writing OCaml.
In this post I want to summarize some of the reasons why I don’t like OCaml and why I wouldn’t choose it for a new project today.
To me it’s absolutely essential that the language should have some way of defining interfaces, implementing those interfaces for the types, and programming against those interfaces.
In Haskell, this is done with typeclasses. Rust has a similar mechanism called traits. In languages with classes this is often done with abstract classes and “implementing” those classes in new classes (e.g. implements in Dart).
In OCaml there’s no way to do this. I have to explicitly pass functions along with my values, maybe in a product type, or with a functor, or as an argument.
Regardless of how I work around this limitation, it’s extremely inconvenient. Things that must be trivial in any code base, such as converting a value to a string for debugging purposes, become a chore, and sometimes even impossible.
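A minimal sketch of what this explicit passing looks like, assuming a hypothetical `show` record type and helper names (`show_int`, `show_list`, `show_pair`) that are made up for illustration and are not part of the standard library:

```ocaml
(* A hypothetical "Show" interface, represented as a record of functions
   that has to be passed around by hand. *)
type 'a show = { show : 'a -> string }

let show_int : int show = { show = string_of_int }

let show_list (elem : 'a show) : 'a list show =
  { show = (fun xs -> "[" ^ String.concat "; " (List.map elem.show xs) ^ "]") }

let show_pair (a : 'a show) (b : 'b show) : ('a * 'b) show =
  { show = (fun (x, y) -> "(" ^ a.show x ^ ", " ^ b.show y ^ ")") }

(* The caller assembles the right "instance" manually at every use site. *)
let debug_string =
  (show_list (show_pair show_int show_int)).show [ (1, 2); (3, 4) ]
```

With typeclasses or traits the compiler would pick the instance for `(int * int) list` from the type alone; here the nesting of `show_list`, `show_pair`, and `show_int` has to mirror the type by hand.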
As far as I know, there was at least one attempt at ameliorating this with modular implicits (implicit parameter passing), but I don’t know what happened to it since 2017. It looks like it’s still not a part of the language and the standard library is not using it.
OCaml’s standard library is just bizarre. It has lots of small issues, and a few larger ones. It’s really just extremely painful to use.
Some examples of the issues:
Zoo of printing/debugging and conversion functions such as
Overly polymorphic operators with type 'a -> 'a -> bool such as = (called “structural equality”, throws an exception if you pass a function) and >. Code that uses these operators will probably not work on user-defined types as expected.
Standard types are sometimes persistent, sometimes mutable.
Set are persistent.
Hashtbl are mutable.
cardinal, length function for
Bytes.t, the big int type is
Big_int.t). The functions in these modules are also inconsistently named.
Big_int functions are suffixed with _big_int (e.g. add_big_int).
Bytes module functions are not prefixed or suffixed.
The regex module uses global state:
string_match runs a regex and sets some global state.
matched_string returns the last matched string using the global state.
Lack of widely used operations such as popcount for integer types, Unicode character operations.
It doesn’t have proper string and character types:
String is a byte array,
char is a byte.
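A few of the points above can be reproduced with a short sketch. The binding names are made up for illustration, and the structural-equality result assumes the balanced-tree representation that the standard library's Set currently uses:

```ocaml
module IntSet = Set.Make (Int)

(* Structural equality compares representations, not meaning: the two sets
   hold the same elements but were built in a different order. *)
let s1 = IntSet.(empty |> add 1 |> add 2)
let s2 = IntSet.(empty |> add 2 |> add 1)
let same_elements = IntSet.equal s1 s2
let structurally_equal = s1 = s2

(* Structural equality on functions raises Invalid_argument at run time. *)
let equality_on_functions_raises =
  try
    ignore ((fun x -> x + 1) = (fun x -> x + 1));
    false
  with Invalid_argument _ -> true

(* Set is persistent: adding an element returns a new set, s1 is unchanged.
   Note the size function is called cardinal here ... *)
let s3 = IntSet.add 3 s1
let s1_still_has_two = IntSet.cardinal s1 = 2

(* ... while Hashtbl is mutable, and its size function is called length. *)
let tbl : (string, int) Hashtbl.t = Hashtbl.create 8
let () = Hashtbl.add tbl "a" 1
let tbl_len = Hashtbl.length tbl

(* Strings are byte sequences: U+00E9 encoded as UTF-8 takes two bytes,
   and String.length counts bytes. *)
let e_acute = "\xc3\xa9"
let byte_len = String.length e_acute
```

Here `same_elements` is true while `structurally_equal` is false, which is the kind of surprise the polymorphic operators cause on user-defined or abstract types.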
The bad state of OCaml’s standard library also causes fragmentation in the ecosystem with two competing alternatives: Core and Batteries.
OCaml doesn’t have a single-line comment syntax.
The expression syntax has just too many issues. It’s inconsistent in how it uses delimiters: while and for end with done, but let, if, match, and try don’t, even though the right-most non-terminal is the same in all of these productions:
    expr ::= ...
      | while <expr> do <expr> done
      | for <value-name> = <expr> ( to | downto ) <expr> do <expr> done
      | let <let-binding> in <expr>
      | if <expr> then <expr> [ else <expr> ]
      | match <expr> with (| <pattern> [ when <expr> ] -> <expr>)+
      | try <expr> with (| <pattern> [ when <expr> ] -> <expr>)+
      ...
It has for and while loops, but no break or continue. So you use exceptions with a try inside the loop for continue, and outside for break.
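A minimal sketch of that workaround, assuming two hypothetical exceptions named `Break` and `Continue` (neither is provided by the standard library) and a made-up function for illustration:

```ocaml
exception Break
exception Continue

(* Sum the odd numbers in [xs], stopping at the first negative number. *)
let sum_odds_until_negative (xs : int array) : int =
  let total = ref 0 in
  let i = ref 0 in
  (* The outer try emulates break. *)
  (try
     while !i < Array.length xs do
       let x = xs.(!i) in
       incr i;
       (* The inner try emulates continue. *)
       (try
          if x < 0 then raise Break;
          if x mod 2 = 0 then raise Continue;
          total := !total + x
        with Continue -> ())
     done
   with Break -> ());
  !total
```

The parentheses around both try expressions are needed because try has no closing delimiter.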
It also has lots of ambiguities, and some of these ambiguities are resolved in an unintuitive way. In addition to making OCaml difficult to parse correctly, this can actually cause incorrect reading of the code.
Most common example is probably nesting match and try:

    match e0 with
    | p1 -> try e1 with p2 -> e2
    | p3 -> e3

Here p3 -> e3 is a part of the try expression, not the match.
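The nested match/try case can be reproduced with a small sketch. The function names are hypothetical, and the warning attribute is only there so the deliberately non-exhaustive match compiles quietly:

```ocaml
(* Silence the non-exhaustive match warning that classify_bad triggers. *)
[@@@warning "-8"]

(* The last case is parsed as a handler of the try, so the match itself
   only has the 0 case. *)
let classify_bad n =
  match n with
  | 0 -> try failwith "zero" with Failure _ -> "caught"
  | _ -> "other"

(* Parenthesizing the try gives the reading the layout suggests. *)
let classify_good n =
  match n with
  | 0 -> (try failwith "zero" with Failure _ -> "caught")
  | _ -> "other"
```

Under this parse, `classify_bad 1` fails with Match_failure instead of returning "other", while `classify_good 1` returns "other".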
Another example is the sequencing syntax <expr> ; <expr> and productions with <expr> as the right-most symbol:

    let test1 b =
      if b then
        print_string "1"
      else
        print_string "2";
      print_string "3"

Here print_string "3" is not a part of the if expression, so this function always prints “3”.
However, even though match also has <expr> as the right-most symbol, it has different precedence in comparison to semicolon:

    let test2 b =
      match b with
      | true -> print_string "1"
      | false -> print_string "2"; print_string "3"

Here print_string "3" is a part of the false -> ... branch.
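The difference between the two can be checked with a sketch that writes to a Buffer instead of standard output. The `run` helper and the buffer parameter are additions for illustration; the bodies otherwise mirror test1 and test2:

```ocaml
(* Same shape as test1: the final statement runs after the whole if. *)
let test1 buf b =
  if b then Buffer.add_string buf "1" else Buffer.add_string buf "2";
  Buffer.add_string buf "3"

(* Same shape as test2: the final statement belongs to the false branch. *)
let test2 buf b =
  match b with
  | true -> Buffer.add_string buf "1"
  | false -> Buffer.add_string buf "2"; Buffer.add_string buf "3"

(* Hypothetical helper: run a test and return everything it "printed". *)
let run f b =
  let buf = Buffer.create 4 in
  f buf b;
  Buffer.contents buf
```

`run test1 true` gives "13" and `run test1 false` gives "23", while `run test2 true` gives "1" and `run test2 false` gives "23".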
Try to guess how these functions are parsed:
    (* Is the last print part of `else` or not? *)
    let test3 b =
      if b then
        print_string "1"
      else
        let x = "2" in
        print_string x;
        print_string "3"

    (* Is this well-typed? *)
    let test4 b =
      if b then
        1, 2
      else
        3, 4

    (* Is the type of this `(int * int) array -> unit` or `int array -> unit * int`? *)
    let test5 a = a.(0) <- 1, 2

    (* What if I replace `,` with `;`? Does this set the element 1 or 2? *)
    let test6 a = a.(0) <- 1; 2
When writing OCaml you have to keep these rules in mind.
It also has the “dangling else” problem:
    (* Is `else` part of the inner `if` or the outer? *)
    if e1 then if e2 then e3 else e4
Finally, and I think this is probably the strangest thing about OCaml’s syntax, and I’m not even sure exactly what’s happening here (I can’t find anything relevant in the language documentation): comments in OCaml are somehow tokenized, and those tokens need to be terminated. They can be terminated inside another comment, or even outside. This is a bit difficult to explain, but here’s a simple example:
    (* " *)
    print_string "hi"
OCaml 5.0.0 rejects this program with this error:
    File "./test.ml", line 2, characters 16-17:
    2 | print_string "hi"
                        ^
      String literal begins here
From the error message it seems like the " in the comment line actually starts a string literal, which is terminated in the first quote of "hi". The closing double quote of "hi" thus starts another string literal, which is not terminated.
However that doesn’t explain why this works:
    (* " *)
    print_string "hi"
    (* " *)
    print_string "bye"
If my explanation of the previous version were correct this would fail with an unbound hi variable, but it works and prints “bye”!
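One reading that is consistent with both observations (an assumption here, not something the documentation cited in this post confirms): the lexer scans string literals inside comments, so a " inside a comment opens a string that swallows the following *) and runs until the next quote. Under that reading the first comment in the second program only closes on its third line, and everything in between is commented out. A small sketch with hypothetical bindings:

```ocaml
let shadowed = "outer"

(* " *)
let shadowed = "hidden"
(* " *)

(* If the middle binding were live, this would be "hidden". *)
let visible = shadowed
```

Here the comment opened before the first quote contains the string literal running up to the quote before hidden, then the word hidden, then a second string literal running to the quote on the last comment line, and only then the closing *). The middle binding never takes effect.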
I’m not following developments in the OCaml ecosystem too closely, but just two years ago it was common to use Makefiles to build OCaml projects. The language server barely worked on a project with less than 50 kloc. There was no standard way of doing compile-time metaprogramming and some projects even used the C preprocessor (cpp).
Some of these things probably improved in the meantime, but the overall package is still not good enough compared to the alternatives.
Almost all modern statically typed languages have closures, higher-order functions/methods, lazy streams, and combinators that run efficiently. Persistent/immutable data structures can be implemented even in C.
Also, OCaml has no tracking of side-effects (like in Haskell), and the language and the standard library have lots of features and functions with mutation, such as the array update syntax, mutable record fields, Hashtbl, and the regex module.
The only thing that makes OCaml more “functional” than e.g. Dart, Java, or Rust is that it supports tail calls. While having tail calls is important for functional programming, I would happily give up on tail calls if that means not having the problems listed above.
Also keep in mind that when you mix imperative and functional styles tail calls become less important. For example, I don’t have to implement a stream map function in Dart with a tail call to map the rest of the stream, I can just use a loop.
In my opinion there is no reason to use OCaml in a new project in 2023. If you have a reason to think that OCaml is the best choice for a new project, please let me know your use case; I’m genuinely curious.