osa1

Some more notes on OCaml modules

March 14, 2026 - Tagged as: en, ocaml, plt.

In a previous post we tried to solve a real problem (described in another post) with OCaml modules.

This post is basically my notes from studying OCaml modules by using them (rather than, e.g., reading a formal description of them or implementing them¹).

My goal with this series of blog posts is to figure out whether we can have a similar mechanism in Fir. This feature should (1) solve the original problem (linked above), and (2) let modules be optimized away at compile time (they should not exist at runtime, and using them should not come with runtime costs).

The main syntax for defining modules:

Signatures are defined with the sig ... end syntax and given a name with the module type Foo = ... syntax.
Module definitions are given with the struct ... end syntax and given a name with the module Foo = ... syntax.

When a module is not given a signature explicitly, its signature is inferred from the module definition. An .mli file defines the signature of its corresponding .ml file. The syntax of an .mli file is the body part of sig ... end, and the syntax of an .ml file is the body part of struct ... end.
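To make the correspondence concrete, here is a minimal sketch. The names `Counter`, `SealedCounter`, and `counter.mli`/`counter.ml` are made up for illustration; they don't come from the earlier posts.

```ocaml
(* No signature given, so the compiler infers one that lists every member:
   sig val step : int val start : int val next : int -> int end *)
module Counter = struct
  let step = 1
  let start = 0
  let next n = n + step
end

(* Roughly what a (hypothetical) counter.mli + counter.ml pair amounts to:
   the .mli contents sit between sig ... end, and the .ml contents sit
   between struct ... end. *)
module SealedCounter : sig
  val start : int
  val next : int -> int
end = struct
  let step = 1 (* not listed in the signature, so not visible outside *)
  let start = 0
  let next n = n + step
end

let () =
  (* `Counter.step` is accessible; `SealedCounter.step` would be rejected. *)
  ignore (Counter.next Counter.step);
  ignore (SealedCounter.next SealedCounter.start)
```

The only difference between the two modules is the explicit signature: it decides what the rest of the program can see.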

Some interesting properties of modules:

Their types can always be inferred.
They’re structurally matched.
They’re not first-class by themselves, but they can be “packed” as values, allowing them to be used as first-class values.

Example:

module type Foo = sig
  type t
  val make_t : unit -> t
  val f : t -> unit
end

module Foo1 = struct
  type t = A | B
  let make_t () = A
  let f _ = ()
  let g _ = print_string "hi"
end

module Foo2 = struct
  type t = string
  let make_t () = "hi"
  let f _ = ()
end

Note that types of Foo1 and Foo2 are inferred. They have the members of Foo, but Foo1 has an extra member.

I can use both Foo1 and Foo2 as Foo, but I need to declare a new module for this with an explicit signature. Modules can be declared at the top-level but also in nested scopes:

module FooTest1 : Foo = Foo1
module FooTest2 : Foo = Foo2

let test1 () =
  (* LocalFoo1 has signature `Foo` here, so I can't use `g` (the extra member). *)
  let module LocalFoo1 : Foo = Foo1 in
  LocalFoo1.f (LocalFoo1.make_t ());

  (* This doesn't change `Foo1`'s type, so `g` can be used. *)
  let module LocalFoo2 = Foo1 in
  LocalFoo2.g ()

Modules by themselves are not first-class values, but they can be packed as first-class values:

(* Inferred type: `unit -> (module Foo)` *)
let select_foo () =
  (* The parens are necessary here for this to parse. *)
  if Random.bool () then (module Foo1 : Foo) else (module Foo2 : Foo)

(* Inferred type: `(module Foo) -> unit` *)
let use_foo foo =
  (* Special syntax for unpacking modules. *)
  let module F = (val foo : Foo) in
  (* Inferred type: `F.t` *)
  let t = F.make_t () in
  F.f t

Packing comes with a runtime cost: first-class modules are similar to records with existentials, both in how they’re used and in how they’re represented at runtime.

As far as I understand, packing and unpacking are the only features that make modules first-class. Without them, code using modules can’t be polymorphic at runtime (every module reference is resolved statically), so modules could completely disappear during compilation.
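The comparison with records and existentials can be sketched as follows. This is my own approximation of the idea, not how the compiler actually represents packed modules; `foo_ops`, `packed_foo`, `Pack`, and `calls` are names invented for the example.

```ocaml
(* A record holding the value members of `Foo`, parameterised by `t`. *)
type 't foo_ops = {
  make_t : unit -> 't;
  f : 't -> unit;
}

(* Hiding `'t` behind a constructor makes it existential. It plays the role
   of the abstract `type t` in the signature. *)
type packed_foo = Pack : 't foo_ops -> packed_foo

(* Counts calls to `f`, only so that the example has an observable effect. *)
let calls = ref 0

let foo1 = Pack { make_t = (fun () -> 0); f = (fun _ -> incr calls) }
let foo2 = Pack { make_t = (fun () -> "hi"); f = (fun _ -> incr calls) }

(* Counterpart of `select_foo`: both branches have type `packed_foo`. *)
let select_foo b = if b then foo1 else foo2

(* Counterpart of `use_foo`: matching on `Pack` is the "unpacking". *)
let use_foo packed =
  match packed with
  | Pack ops -> ops.f (ops.make_t ())
```

As with `(val foo : Foo)`, the hidden type can be used inside the match but can't escape it.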

Reuse between signatures and modules: the problem we discovered in the previous post was that there’s absolutely no reuse between a module signature and a module definition. Here’s another signature, this time with a concrete type definition:

module type Bar = sig
  type t = A | B
  val print_t : t -> unit
end

Even though t is concrete this time, any module that implements this signature needs to duplicate the definition of it.

We also can’t give print_t a default implementation in the signature. Even if we could, it would be useless, since there’s no way to reuse anything from a signature in a module.

module Bar1 : Bar = struct
  type t = A | B

  let print_t = function
    | A -> print_string "A\n"
    | B -> print_string "B\n"
end

Here we can’t omit the full definition of t. This is the main problem with OCaml modules that I’d like to solve.

include helps, but doesn’t solve the problem entirely: we can “include” a module in another, and a signature in another. As an example, let’s say we’ll reuse the same t above in a few modules and signatures. In signatures:

module type SigT = sig
  type t = A | B
end

module type SigInclude1 = sig
  include SigT
  val f : t -> unit
end

module type SigInclude2 = sig
  include SigT
  val g : t -> unit
end

In modules:

module StructT = struct
  type t = A | B
end

module StructInclude1 = struct
  include StructT
  let f = function
    | A -> ()
    | B -> ()
end

module StructInclude2 = struct
  include StructT
  let g = function
    | A -> ()
    | B -> ()
end

This helps reduce code duplication in modules and in signatures, but since there’s no mixing between the two, there still need to be at least two definitions of the type: one in a sig and one in a struct.

Functors are for modules, not signatures: A functor is a function from a module to a module. There’s no equivalent for signatures.

Here’s a functor that adds a type and a function to a module:

module AddStuff (M : SigInclude1) = struct
  include M

  type t2 = C | D

  let g = function
    | A -> ()
    | B -> ()
end
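For completeness, here is a sketch of applying the functor statically. The first few definitions repeat the ones above so that the snippet compiles on its own; `R` is just a name picked for the result.

```ocaml
module type SigT = sig
  type t = A | B
end

module type SigInclude1 = sig
  include SigT
  val f : t -> unit
end

module StructInclude1 = struct
  type t = A | B
  let f = function
    | A -> ()
    | B -> ()
end

module AddStuff (M : SigInclude1) = struct
  include M

  type t2 = C | D

  let g = function
    | A -> ()
    | B -> ()
end

(* Functor application happens at the module level, with a module (not a
   value) as the argument. *)
module R = AddStuff (StructInclude1)

(* `R` has everything from the argument plus the additions. `t` is not
   abstract in `SigInclude1`, so `R.t` and `StructInclude1.t` are the same
   type and their constructors are interchangeable. *)
let () =
  R.f R.A;
  R.g StructInclude1.B;
  let (_ : R.t2) = R.C in
  ()
```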

Functors can’t be passed packed modules; we have to unpack first and then apply:

let use_add_stuff (packed : (module SigInclude1)) =
  let module M = (val packed : SigInclude1) in
  let module R = AddStuff(M) in
  R.f A;
  R.g B

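Functors themselves can be packed, but only at a named functor type (a module type defined as functor (M : SigInclude1) -> sig ... end), and the packed functor has to be unpacked again before it can be applied. So they’re first-class only in the same indirect way as other modules.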

If I omit the include part in AddStuff, that effectively drops the argument module’s members from the result, as the returned module signature won’t include the members in the argument module.

(Another way to do the same would be to give a signature to the returned module, with some of the members in SigInclude1 missing.)
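Both variants can be sketched like this. `AddOnly`, `AddSealed`, and `Arg` are hypothetical names, and `g` returns an int here (unlike above) only so that the result is easy to check.

```ocaml
module type SigInclude1 = sig
  type t = A | B
  val f : t -> unit
end

(* No `include M`: the result only has `t2` and `g`. *)
module AddOnly (M : SigInclude1) = struct
  type t2 = C | D

  let g = function
    | M.A -> 0
    | M.B -> 1
end

(* The alternative: keep `include M`, but give the result a signature that
   leaves out `f`. *)
module AddSealed (M : SigInclude1) : sig
  type t = M.t
  val g : t -> int
end = struct
  include M

  let g = function
    | A -> 0
    | B -> 1
end

module Arg = struct
  type t = A | B
  let f _ = ()
end

module R1 = AddOnly (Arg)
module R2 = AddSealed (Arg)

(* `R1.f` and `R2.f` don't exist; referring to either is a type error. *)
let () =
  ignore (R1.g Arg.A);
  ignore (R2.g Arg.B)
```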

Structural type checking of modules is not important for me, at least for the problem I’m trying to solve. So I’m not looking into this in this blog post.

In a future post we’ll have a more formal treatment of OCaml modules.

If we remove packing, modules become second-class, and they can be completely optimized away at compile time.

The main question I still have is whether we really need the sig/struct distinction, and if we do, whether we could “inherit” (or “include”) type and term definitions from signatures in structs. I’m hoping to figure these out in a future post.

¹ Which I’m also doing, but that will be the subject of another post.