<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>osa1.net - All posts</title>
    <link href="http://osa1.net/rss.xml" rel="self" />
    <link href="http://osa1.net" />
    <id>http://osa1.net/rss.xml</id>
    <author>
        <name>Ömer Sinan Ağacan</name>
        <email>omeragacan@gmail.com</email>
    </author>
    <updated>2026-04-15T00:00:00Z</updated>
    <entry>
    <title>Fir now compiles to C (+ extensible named types, associated types, modules, and more)</title>
    <link href="http://osa1.net/posts/2026-04-15-fir-devlog.html" />
    <id>http://osa1.net/posts/2026-04-15-fir-devlog.html</id>
    <published>2026-04-15T00:00:00Z</published>
    <updated>2026-04-15T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>One of my original goals with Fir was to bootstrap it as early as possible. I was so determined that I committed the first code for the self-hosted compiler in the <a href="https://github.com/fir-lang/fir/commit/a69e3cefcb42c1ad63e303e70dbd9e66d5aa512f">322nd commit</a>, on 11 April 2025, after less than a year of development in the open<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>, when the language was barely usable. To put into perspective how early this was, we’re currently on commit 1,052, and in my opinion it only recently became somewhat usable.</p>
<p>Unsurprisingly, this turned out to be a challenge, and I had to accept the fact that a compiler for a non-trivial language is a lot of work. You really need a lot of language features + a good implementation (generating fast code) for it. It’s not that it cannot be done otherwise, but the process becomes extremely slow, tedious, and boring.</p>
<p>Something else that became evident as I worked on the self-hosted compiler was that, even if I finish it with just the features we have, I’ll have to refactor it quite significantly as we implement the planned features, to the point where it could feel more like a rewrite than a refactoring.</p>
<p>Finally, I thought (perhaps mistakenly), with some of the recent developments in programming tooling and software development methods (you know what I’m talking about), with a working reference implementation + tests, bootstrapping effort could be largely automated.</p>
<p>So I started implementing features that I consider essential for Fir 1.0, in the reference implementation. In this post we’ll look at some of these features that were recently implemented.</p>
<h2 id="fir-now-compiles-to-c">Fir now compiles to C</h2>
<p>The Fir reference implementation now compiles to C. The motivation for this development was that running the self-hosted compiler with the interpreter to compile itself was taking 8.8s, despite doing only parsing and name resolution. That was already too long, and it was going to get much worse as we implemented type checking, monomorphisation, and code generation.</p>
<p>I made a few attempts at optimizing the interpreter, but it became clear that with very little effort, compared to designing and implementing a bytecode interpreter, I could compile it to C<a href="#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a>. Because we already had a monomorphiser, compilation to C was mostly very straightforward, and we immediately got a roughly 12x speedup: the self-hosted compiler started checking itself in 0.7s instead of 8.8s.</p>
<p>When working on the compiler, the full cycle currently takes 1.7s: compiling the compiler to C, compiling that C to native code with clang, and then running the resulting executable on the compiler itself.</p>
<p>This also improved the workflow in other areas: formatting the whole code base and compiling PEG files take an instant now, instead of many seconds.</p>
<p>This also allowed other things that made this even better in terms of return-on-investment: we got free garbage collection with the <a href="https://github.com/bdwgc/bdwgc">Boehm-Demers-Weiser conservative GC</a>, and value types became trivial to implement. This will also make it easier to add C FFI in the future. (more on this below)</p>
<p>The interpreter still exists, mainly to keep <a href="https://fir-lang.github.io/">the online interpreter</a> running.</p>
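<p>To give a rough idea of why the statement expressions mentioned in footnote 2 help here, below is a minimal hand-written sketch. It is not actual output of the Fir compiler, and the names <code>f</code>, <code>g</code>, and <code>nested</code> are hypothetical, made up for illustration. The point is only that a conditional in argument position can stay nested inside the call, instead of being hoisted into a temporary by a separate flattening pass:</p>
<pre><code>/* Hypothetical stand-ins for two compiled functions. */
static int f(int x) { return x + 1; }
static int g(int x) { return x * 2; }

/* Pseudo-source being compiled: f(if c then g(a) else b) + 1
 *
 * The ({ ... }) block is a GCC/clang statement expression: it can contain
 * declarations and statements, and its value is the value of its last
 * expression statement. The conditional therefore stays in argument
 * position, with no separate flattening step. */
static int nested(int c, int a, int b) {
    return f(({
        int tmp;
        if (c) { tmp = g(a); } else { tmp = b; }
        tmp;
    })) + 1;
}</code></pre>
<p>With GCC or clang, <code>nested(1, 3, 10)</code> evaluates to <code>8</code> and <code>nested(0, 3, 10)</code> evaluates to <code>12</code>.</p>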
<h2 id="value-types">Value types</h2>
<p>Fir literally started as a “high-level language with value types”, but it wasn’t entirely trivial to implement them until we had the C backend.</p>
<p>With the C backend, it became a matter of making it generate typed code (instead of treating all values as <code>uint64_t*</code> or similar), and then not <code>malloc</code>ing value types.</p>
<p>Here’s an example value type, from the standard library:</p>
<pre><code># Immutable, UTF-8 encoded strings.
value type Str(
    # UTF-8 encoding of the string.
    _bytes: Array[U8],
)</code></pre>
<p>Relevant struct definitions in generated C:</p>
<pre><code>typedef struct Array_U8 {
    U8* data_ptr;
    uint64_t len;
} Array_U8;

typedef struct Str {
    Array_U8 _0;
} Str;</code></pre>
<p>This type is then used directly (instead of as a pointer). Here’s a forward-declaration of a function from the self-hosted compiler:</p>
<pre><code>// Compiler/ParseUtils.fir:32:1 parseCharLit[U32]
static Char _fun_8(Str _p0);</code></pre>
<p>(The comment line here is generated by the compiler to make it easier to read the generated code.)</p>
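<p>As a small hand-written illustration of what “used directly” means at the C level, the sketch below reuses the two struct definitions from above. The functions <code>str_from_bytes</code>, <code>str_len_by_value</code>, and <code>str_len_boxed</code> are hypothetical and not part of the generated code. I’m also assuming <code>U8</code> is a typedef for <code>unsigned char</code>, and using <code>unsigned long long</code> in place of <code>uint64_t</code> to keep the snippet free of includes:</p>
<pre><code>typedef unsigned char U8; /* assumption: U8 is a plain byte */

typedef struct Array_U8 {
    U8* data_ptr;
    unsigned long long len; /* stands in for uint64_t */
} Array_U8;

typedef struct Str {
    Array_U8 _0;
} Str;

/* Builds a Str and returns it by value: no malloc involved. */
static Str str_from_bytes(U8* data, unsigned long long len) {
    Str s = { { data, len } };
    return s;
}

/* Value-type style: the struct itself is the argument. */
static unsigned long long str_len_by_value(Str s) {
    return s._0.len;
}

/* Boxed style, for contrast: the argument is a pointer that has to be
 * dereferenced, and the Str it points to has to live somewhere else. */
static unsigned long long str_len_boxed(const Str* s) {
    return (*s)._0.len;
}</code></pre>
<p>A caller can write <code>str_len_by_value(str_from_bytes(bytes, 2))</code> without any heap allocation for the <code>Str</code> wrapper, whereas the boxed variant needs the <code>Str</code> to be stored somewhere first so that a pointer to it can be passed.</p>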
<h2 id="associated-types">Associated types</h2>
<p>This was a feature that I delayed implementing for way longer than I should’ve, mostly because I didn’t know how to implement it and it took a while to figure out.</p>
<p>Associated types in Fir are the same feature as associated types in Rust. The most common use case for them is the <code>Iterator</code> trait. Before associated types, <code>Iterator</code> in Fir looked like this: (omitting extra methods with default implementations)</p>
<pre><code>trait Iterator[iter, item, exn]:
    next(self: iter) Option[item] / exn</code></pre>
<p>Here’s what the <code>Iterator</code> implementation of <code>CharIter</code> (which iterates over the characters of a string) looked like:</p>
<pre><code>impl Iterator[CharIter, Char, exn]:
    next(self: CharIter) Option[Char] / exn:</code></pre>
<p>This trait definition has a problem. The type of <code>Iterator.next</code> is this:</p>
<pre><code>[Iterator[iter, item, exn]] Fn(self: iter) Option[item] / exn</code></pre>
<p>Based on this type, in a call site like <code>charIter.next()</code> (where <code>charIter : CharIter</code>), we generate the predicate <code>Iterator[CharIter, item, exn]</code> and the type of the call expression becomes <code>Option[item]</code>. (where <code>item</code> and <code>exn</code> are fresh unification variables)</p>
<p>If the expected type of the call expression is not precise enough to unify that <code>item</code> type with a concrete type, the predicate never becomes <code>Iterator[CharIter, Char, exn]</code>, and we can’t solve it, because there isn’t an <code>impl</code> for <code>Iterator[CharIter, item, exn]</code> (note: with generic <code>item</code>). We only have <code>Iterator[CharIter, Char, exn]</code>.</p>
<p>This resulted in lots of type annotations in the code that uses the <code>Iterator</code> trait. Most importantly, it required type annotations in <code>for</code> loops as <code>for</code> loops used <code>Iterator</code> under the hood. For example:</p>
<pre><code>for char: Char in charIter:
    print(char)</code></pre>
<p>Here <code>print</code> is a generic function that works on any <code>ToStr</code> type, so without the type annotation the predicate became too generic and couldn’t be solved.</p>
<p>With associated types, the trait now looks like this:</p>
<pre><code>trait Iterator[iter, exn]:
    type Item
    next(self: iter) Option[Item] / exn

impl Iterator[CharIter, exn]:
    type Item = Char
    next(self: CharIter) Option[Char] / exn:</code></pre>
<p>With this definition, the predicate for the same call becomes <code>Iterator[CharIter, exn]</code> (where <code>exn</code> is a fresh unification variable), and that’s immediately resolved using this <code>impl</code>. The <code>for</code> loop example above now works without a type annotation.</p>
<p>Associated types also allowed the next feature.</p>
<h2 id="its-now-possible-to-implement-traits-for-record-types">It’s now possible to implement traits for record types</h2>
<p>This was a small development in terms of code, but an important one for the language. Until this feature, we could pass records around and access fields in polymorphic contexts, but if we wanted to take a polymorphic record (with a row extension) and, e.g., print it, there was no way to do it.</p>
<p>This wasn’t too important until recently, as the main use case for records was returning multiple values. You’d then destruct/pattern match on the return values directly and use them individually. For example:</p>
<pre><code>divRem(x: U32, y: U32) (div: U32, rem: U32): ...

# Users just match on the fields instead of passing the return value around
# as a record.
let (div, rem) = divRem(a, b)</code></pre>
<p>However with the other developments listed below, records became much more useful, and not being able to implement traits on them became a problem.</p>
<p>The solution was porting PureScript’s <a href="https://pursuit.purescript.org/builtins/docs/Prim.RowList"><code>RowToList</code></a> typeclass to Fir. The idea is that we define a “magic” trait that converts record rows into heterogeneous lists:</p>
<pre><code>trait RecRowToList[recRow]:
    type List
    rowToList(rec: (..recRow)) Option[List]</code></pre>
<p>Here <code>recRow</code> is a record-row-kinded type parameter. This trait is resolved by the compiler for any valid (with right kind) type argument, and depending on the type argument the <code>List</code> type is also generated as a heterogeneous list. The heterogeneous list type is defined like this, in the standard library:</p>
<pre><code>value type List[head, tail](
    head: head,
    tail: Option[tail],
)</code></pre>
<p>In the generated <code>List</code> types for record rows, the <code>head</code> type is always a <code>RecordField</code>:</p>
<pre><code>value type RecordField[t](
    label: Str,
    value_: t,
)</code></pre>
<p>So for example, <code>RecRowToList[row(x: U32, msg: Str)]</code> is resolved by the type checker, and the <code>List</code> type is also resolved as <code>List[RecordField[Str], List[RecordField[U32], []]]</code><a href="#fn3" class="footnote-ref" id="fnref3" role="doc-noteref"><sup>3</sup></a> <a href="#fn4" class="footnote-ref" id="fnref4" role="doc-noteref"><sup>4</sup></a>.</p>
<p>Here’s how to implement <code>ToStr</code> on records using this machinery:</p>
<pre><code>impl[ToStr[RecRowToList[r].List]] ToStr[(..r)]:
    toStr(self: (..r)) Str:
        match RecRowToList[r].rowToList(self):
            Option.None: &quot;()&quot;
            Option.Some(list): &quot;(`list`)&quot;


impl[ToStr[t]] ToStr[RecordField[t]]:
    toStr(self: RecordField[t]) Str:
        &quot;`self.label` = `self.value_`&quot;


impl[ToStr[head], ToStr[tail]] ToStr[List[head, tail]]:
    toStr(self: List[head, tail]) Str:
        match self.tail:
            Option.None: &quot;`self.head`&quot;
            Option.Some(t): &quot;`self.head`, `t`&quot;

impl ToStr[[]]:
    toStr(self: []) Str:
        panic(&quot;unreachable&quot;)</code></pre>
<p>Note that the <code>List</code> and <code>RecordField</code> types are value types, so <code>rowToList</code> does not allocate. It just generates a different representation of the record on stack that we can recurse on.</p>
<h2 id="matching-a-bunch-of-fields-at-once-as-a-record">Matching a bunch of fields at once, as a record</h2>
<p>This was one of the very simple features that made records so much more useful.</p>
<p>When pattern matching fields, we can now use <code>..var</code> syntax to assign unmatched fields to a variable, as a record. Here’s a simple example:</p>
<pre><code>type Test(
    x: U32,
    y: U32,
    z: U32,
    msg: Str
)


main():
    let x = Test(x = 1, y = 2, z = 3, msg = &quot;hi&quot;)
    let Test(y, ..rest) = x
    print(rest)</code></pre>
<p>In the pattern, <code>y</code> matches the field <code>y</code>, and <code>rest</code> matches the rest of the fields, as <code>(x: U32, z: U32, msg: Str)</code>. Then, using the <code>ToStr</code> implementation of records as shown above, this prints <code>(msg = "hi", x = 1, z = 3)</code>.</p>
<p>This is not the main use case for this feature, but just as a note, when combined with traits on records, this allows easily implementing traits by reusing records’ implementations of the traits. For example, <code>ToStr</code> for <code>Test</code> here can be implemented as:</p>
<pre><code>impl ToStr[Test]:
    toStr(self: Test) Str:
        let Test(..fields) = self
        &quot;Test`fields`&quot;</code></pre>
<p>With this implementation, the value <code>x</code> above now prints as <code>Test(msg = "hi", x = 1, y = 2, z = 3)</code>. This is the same output as the derived <code>ToStr</code> for this type, just with a different field order. (The derived <code>impl</code> would print fields in the source code order, so: <code>x</code>, <code>y</code>, <code>z</code>, <code>msg</code>)</p>
<h2 id="splicing-records-and-named-arguments">Splicing records and named arguments</h2>
<p>We can now pass records as named arguments. The feature above copies field values to records; this one copies records to named arguments for fields.</p>
<p>This is also straightforward and I think a simple example should suffice, using the same types as above:</p>
<pre><code>main():
    let x = Test(x = 1, y = 2, z = 3, msg = &quot;hi&quot;)
    print(x)

    let Test(y, ..rest) = x     # rest: (x: U32, z: U32, msg: Str)
    let y = Test(y = 0, ..rest)
    print(y)

# output:
# Test(msg = hi, x = 1, y = 2, z = 3)
# Test(msg = hi, x = 1, y = 0, z = 3)</code></pre>
<p>Reminder: records (and variants) are value types. They’re not heap allocated. So the code above does not allocate for the <code>rest</code> record.</p>
<p>We can also make larger records from smaller ones with this feature:</p>
<pre><code>main():
    let x = (x = u32(123), y = u32(456))
    let y = (msg = &quot;hi&quot;, ..x)
    print(y)

# output: (msg = &quot;hi&quot;, x = 123, y = 456)</code></pre>
<p>Splicing two records together is currently not possible: there can be at most one <code>..expr</code> in a record expression.</p>
<h2 id="extensible-named-types">Extensible named types</h2>
<p>This is a big one that I talked about <a href="https://osa1.net/posts/2026-03-07-extensible-named-types-fir.html">in a previous post</a>. It only became usable after the record features above, associated types, and type synonyms.</p>
<p>For a running example, I added <a href="https://fir-lang.github.io/?file=NamedTypeExtensions.fir">a full program</a> to the online interpreter, showing a solution to the extensible AST types problem described in the blog post. It’s extensively documented, explaining all the interesting bits, so I recommend just checking it out.</p>
<p>In short, we allow extending named types using record rows. Pattern matching, allocation, and everything else works the same way as records. Here’s an example:</p>
<pre><code>type Foo[r](
    x: U32,
    y: U32,
    ..r
)


impl[r: Row[Rec], ToStr[RecRowToList[r].List]] ToStr[Foo[r]]:
    toStr(self: Foo[r]) Str:
        let Foo(..fields) = self
        &quot;Foo`fields`&quot;


main():
    let x = Foo(x = 1, y = 2, msg = &quot;hi&quot;)
    let y = Foo(b = Bool.True, y = 10, blah = Option.Some(u32(0)), x = 11)
    print(x)
    print(y)


# output:
# Foo(msg = hi, x = 1, y = 2)
# Foo(b = Bool.True, blah = Option.Some(0), x = 11, y = 10)</code></pre>
<p><code>Foo</code> here is an extensible type. In the allocation sites, we allocate it with different extra fields. The inferred types here are:</p>
<ul>
<li><code>x : Foo[row(msg: Str)]</code></li>
<li><code>y : Foo[row(b: Bool, blah: Option[U32])]</code></li>
</ul>
<p>The <code>ToStr</code> implementation here uses the record field matching features explained above, but it can also be derived.</p>
<p>This feature is used in the self-hosted compiler and the tools. The code is a bit long, but we basically use the same idea demonstrated in the online demo linked above, to add different fields to the AST nodes used by different tools. For example, here’s the AST node type for variant expressions, when compiled to C, as a part of the self-hosted compiler:</p>
<pre><code>typedef struct VariantExpr_CompilerAstExts {
    Expr_CompilerAstExts* _0;
    Option_Ty _1;
} VariantExpr_CompilerAstExts;</code></pre>
<p>And here’s the exact same type, but in the formatter’s compiled C code:</p>
<pre><code>typedef struct VariantExpr_DefaultAstExts {
    Expr_DefaultAstExts* _0;
} VariantExpr_DefaultAstExts;</code></pre>
<p>This is smaller because the formatter doesn’t have the extra field the compiler adds to the type.</p>
<p>Both are generated from this Fir type:</p>
<pre><code>type VariantExpr[exts](
    expr: Expr[exts],
    ..AstExts[exts].InferredTyExts
)</code></pre>
<p>You can see the full generic AST definitions used by the compiler and other tools <a href="https://github.com/fir-lang/fir/blob/96429adb83b2242ff806fe624dfe65be45e42b82/Compiler/Ast.fir">here</a>.</p>
<p>Because we can implement traits on records and record rows now, deriving traits also works on extensible types. In the example above, I can just add <code>#[derive(ToDoc)]</code> to <code>Foo</code> and then print it like this:</p>
<pre><code>#[derive(ToDoc)]
type Foo[r](
    x: U32,
    y: U32,
    ..r
)


main():
    let x = Foo(x = 1, y = 2, msg = &quot;hi&quot;)
    let y = Foo(b = Bool.True, y = 10, blah = Option.Some(u32(0)), x = 11)
    print(x.toDoc().render(80))
    print(y.toDoc().render(80))


# output:
# Foo(x = 1, y = 2, (msg = &quot;hi&quot;))
# Foo(x = 11, y = 10, (b = Bool.True, blah = Option.Some(0)))</code></pre>
<p>The AST types in the compiler all derive traits this way.</p>
<h2 id="modules">Modules</h2>
<p>Until recently, importing a module in Fir just parsed the module and copied the parsed code to the current module.</p>
<p>In other words, there was just one module. There were no name spaces, private definitions, selective imports, or importing with renaming.</p>
<p>It took quite a while to design and implement a proper module system and I actually found it quite difficult to design this, even though in the end the design was quite simple. There were two problems that made this difficult for me:</p>
<p>First, I wasn’t sure whether we want just namespacing (plus the usual features for selective imports, renaming, etc.) or something fancier, like first-class modules.</p>
<p>To figure this out I <a href="https://github.com/osa1/a-modular-module-system">studied OCaml’s module system</a> (and also <a href="https://osa1.net/posts/2026-03-10-containing-contagious-types.html">blogged about it</a>) and <a href="https://github.com/osa1/oneml">1ML</a> in a bit more detail, and decided that I want the modules to be type checking units (to be checked in parallel) and namespaces, instead of first-class values.</p>
<p>This significantly simplified the design, but the design space was still huge and there were just two constraints:</p>
<ul>
<li>They shouldn’t require separate files for interfaces and implementations.</li>
<li>Recursive imports should be allowed.</li>
</ul>
<p>So the second problem was that these requirements did not constrain the design space enough to give me a small number of options, with obvious and significant tradeoffs between them. I could probably come up with a dozen designs that would all be good enough.</p>
<p>In the end I had to make somewhat arbitrary decisions, based on what I needed in the past, from the other module systems that I used, and what I didn’t, and preference and taste. I updated one thing as I implemented it, and settled on this:</p>
<ul>
<li><p>Recursive imports are allowed, and there are no interface files. Each module is implemented as one file.</p></li>
<li><p>Module paths follow directory structure on the file system. E.g. an import to <code>Foo/Bar/Baz</code> requires the module to be in <code>Foo/Bar/Baz.fir</code> in the package root.</p></li>
<li><p>A module exports every non-underscored symbol that it has direct access to. This includes names that it imports. There’s no explicit exporting.</p></li>
<li><p>Underscored symbols are only accessible with explicit module paths. There’s nothing that’s truly private. If you really want you can access all private names. This keeps the design simple by avoiding fine-grained access control with things like <code>pub(crate)</code> or <code>pub(foo::bar::baz)</code> in Rust, and conditional compilation for exposing things for testing.</p></li>
<li><p>The usual renaming features are possible: modules can be imported with different names, individual definitions can be imported with different names.</p></li>
<li><p>Module path syntax is different than associated member access syntax: module paths use <code>/</code> as separator, associated members use <code>.</code>. For example:</p>
<ul>
<li><code>A/B</code> in expression context means “constructor B in module A”</li>
<li><code>B.C</code> in expression context means “constructor C of type B”</li>
<li><code>A/B.C</code> in expression context means “constructor C in type B in module A”</li>
</ul></li>
<li><p>This is currently not implemented: when a module exports something (type with constructors, function, …), everything referenced by the signature of the exported thing should also be exported.</p>
<p>This is to avoid the common issues in some languages where you export a function, but not the types that it uses, and the user either has to get it from another package or can’t use your function. Or even if the function is usable (for example, the private type is in the return type position and you just call the function but don’t use the return value), users can’t add type annotations to your function.</p>
<p>The principle here is that I should be able to take any expression in the program and give it a type annotation in a <code>let</code> statement. For trait methods, I should be able to explicitly call the methods with the type arguments. E.g. instead of <code>foo.toStr()</code> I should be able to do <code>ToStr.toStr[&lt;type of foo&gt;](foo)</code> so that means the trait type and all of the type arguments of the trait should be in scope and accessible.</p></li>
</ul>
<p>Here’s how relevant syntax looks currently:</p>
<pre><code># Each module can have at most one `import`. Documentation comments added to
# `import` lines become documentation comment of the module. When a module
# doesn&#39;t import anything an empty `import []` can be added to document the
# module.

## This is the module documentation.

import [
    # Import everything from `Fir/Prelude`, to use directly (without module
    # prefix).
    # This is implicitly added to every module already, so not needed. On here
    # for demonstration purposes.
    Fir/Prelude,

    # This allows using symbols imported from the module with the given prefix.
    # E.g. instead of `Option.Some(123)` we do `P/Option.Some(123)`.
    Fir/Prelude as P,

    # Only imports listed things.
    Fir/Prelude/[Option, Result, min, max],

    # Only imports listed things, but with renaming.
    Fir/Prelude/[Option, Result, min as _min, max as _max],
]


main():
    # Some random combination of imported things, used in different ways.
    print(Option.Some(_min(P/max(P/u32(0), u32(1)), u32(2))))


# output: Option.Some(1)</code></pre>
<p>Some other notes and clarifications on this design:</p>
<ul>
<li><p>Re-exporting imported things can be avoided by adding underscore to the imported names. E.g. in the code examples above, <code>_min</code> and <code>_max</code> are not exported from this module, but other non-underscored imports are.</p>
<p>This is not a special case for imports: underscored things are never exported. If you import something with an underscored name, it’s also not exported just like defined things.</p></li>
<li><p>Modules are only imported explicitly. There’s no re-exporting a module. So if the module above is <code>Foo/Bar</code>, you don’t get <code>Foo/Bar/P</code> when you import it.</p></li>
</ul>
<p>So far I’m happy with how it looks (syntax) and how it works, but as with most things in this language, it’s open to improvements, refinements, and even backwards incompatible changes.</p>
<h2 id="smaller-features-kind-annotations-and-type-synonyms">Smaller features: kind annotations and type synonyms</h2>
<p>These don’t need much introduction, but I want to document why they were needed and implemented.</p>
<p>Type synonyms came in handy in two places:</p>
<ul>
<li><p>With associated types, we want to refer to the associated types directly in the <code>trait</code> and <code>impl</code> bodies. For example, in the <code>Iterator</code> trait:</p>
<pre><code>trait Iterator[iter, exn]:
    type Item
    next(self: iter) Option[Item] / exn</code></pre>
<p>Normally the way you refer to <code>Item</code> here is with <code>Iterator[iter, exn].Item</code>. But within the <code>trait</code> body (and also in <code>impl</code>s), we want to refer to them as <code>Item</code> directly.</p></li>
<li><p>With extensible named types, we want to be able to define generic (extensible) types in a shared library, and then in the libraries using them we want to override them (shadow the original definitions) with instantiated types. For example, the AST library defines <code>type VarExpr[exts](...)</code>. The formatter overrides it with the extension type it needs: <code>type VarExpr = Ast/VarExpr[FormatterExts]</code>.</p></li>
</ul>
<p>The second one is obviously a type synonym, but the first one also uses the same underlying code. We just make type synonyms scoped, and create new synonyms in <code>trait</code> and <code>impl</code> bodies, for the associated types.</p>
<p>Kind annotations became necessary as we started using row-kinded type parameters more, for the extensible named types. Currently kind inference is very simple: it only looks at the current definition. If a type parameter is used in a row extension position (i.e. <code>..var</code>), its kind is inferred as <code>Row[Rec]</code> or <code>Row[Var]</code> depending on whether the extension is in a record (or fields) or variant (or constructors).</p>
<p>That means that in the extensible named type example above:</p>
<pre><code>type Foo[r](
    x: U32,
    y: U32,
    ..r
)</code></pre>
<p>Here <code>r</code>’s kind is inferred as <code>Row[Rec]</code>. But if we had another type that passed a generic <code>r</code> to it:</p>
<pre><code>type Bar[r](foo: Foo[r])</code></pre>
<p>This <code>r</code>’s kind was inferred as <code>*</code>, which is incorrect.</p>
<p>I don’t want to introduce module-level kind inference for various reasons, so I had to add kind annotations here. The correct definition with kind annotations is:</p>
<pre><code>type Bar[r: Row[Rec]](foo: Foo[r])</code></pre>
<p>Kinds follow the same syntax as types. <code>*</code>-kinded type parameters are just listed, without any annotations. This is useful to avoid reordering type parameters just to specify kinds of some of the types. E.g. if I have</p>
<pre><code>foo(x: t, y: Bar[r]): ...</code></pre>
<p>Here the inferred type parameters are <code>[t: *, r: *]</code>, generated from the signature by left-to-right scan. When calling we can explicitly pass them as <code>foo[type1, type2](...)</code>.</p>
<p>This passes a type of the wrong kind to <code>Bar</code>. To fix this, we have to specify the kind of <code>r</code>:</p>
<pre><code>foo[r: Row[Rec]](x: t, y: Bar[r]): ...</code></pre>
<p>But this also reorders type parameters as <code>[r: Row[Rec], t: *]</code> now, as the type parameter lists are generated by a left-to-right scan of the signature.</p>
<p>To fix this, we also have to list the type parameter <code>t</code> explicitly, just without a kind:</p>
<pre><code>foo[t, r: Row[Rec]](x: t, y: Bar[r]): ...</code></pre>
<p>This gives us the original order of the type parameters, but with the correct kinds: <code>[t: *, r: Row[Rec]]</code>.</p>
<h2 id="next-up-c-header-imports-c-ffi">Next up: C header imports (C FFI)</h2>
<p>This post is already too long so I want to keep this part short for now. With the (1) resources that I have (2) things I want to do with this language (3) what we have currently (current implementation), the shortest path to success (some kind of adoption) that I can see is by making C interop absolutely effortless.</p>
<p>By “effortless” I really mean it: I should be able to import a C header file directly in Fir, just use the definitions, link the generated C with object files implementing the prototypes, and provide implementations for symbols used by other compiled C code.</p>
<p>Similar to the module system, this is an area I don’t have a lot of experience in. Depending on things that are out of my control (i.e. life, responsibilities), and on whether I encounter fundamental issues, I suspect this will take 6-12 months to fully implement. Once done, Fir will be useful for many use cases.</p>
<section class="footnotes" role="doc-endnotes">
<hr />
<ol>
<li id="fn1" role="doc-endnote"><p>I started working on it earlier in 2024. Open sourced in June 2024.<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn2" role="doc-endnote"><p>This was partly thanks to the GCC extension <a href="https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html">statement expressions</a>, which allowed me to compile nested expressions directly to C without having to flatten them in an A-normal form IR or similar. The extension is also supported by clang so it didn’t make the generated C less portable.<a href="#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn3" role="doc-endnote"><p><code>[]</code> is the empty variant type, which doesn’t have any values.<a href="#fnref3" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn4" role="doc-endnote"><p>The generated list fields are sorted on field names, so <code>msg</code> comes before <code>x</code> here.<a href="#fnref4" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>]]></summary>
</entry>
<entry>
    <title>Exceptions as shared secrets, demonstrated</title>
    <link href="http://osa1.net/posts/2026-03-13-exceptions-as-shared-secrets.html" />
    <id>http://osa1.net/posts/2026-03-13-exceptions-as-shared-secrets.html</id>
    <published>2026-03-13T00:00:00Z</published>
    <updated>2026-03-13T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>Robert Harper’s <a href="https://existentialtype.wordpress.com/2012/12/03/exceptions-are-shared-secrets/">“Exceptions Are Shared Secrets”</a> is an intriguing blog post, but it may come across as a bit abstract unless you’re already familiar with the idea of accidental exception (or more generally, effect) handling, as the post has no code.</p>
<p>In this post I want to give an example of the problems mentioned in the original post, and say a few words on how we might go about working around or fixing these issues.</p>
<p>The original post makes three assumptions about what an exception is and how it should be used:</p>
<ol type="1">
<li>An exception is just a way of passing a value from a “raiser” to a “handler”.</li>
<li>The raiser wants to limit who can intercept and handle the value (also called a “message”) being passed.</li>
<li>Who can intercept and handle an exception/message needs to be agreed upon via “dynamic classification”.</li>
</ol>
<p>My understanding of “dynamic classification” is that the cooperation between a raiser and a handler doesn’t happen via static types (or any other static mechanism), but by agreeing, at runtime, upon some dynamic features of the values being passed (e.g. the identity of the object being raised).</p>
<p>I found it to be very difficult to come up with a real-world example of accidental exception handling causing a real bug, and I’m not interested in hypothetical issues that much. So for a long time I thought the issue is not that “real”. It was only by coincidence that I came across an example in a discussion on <a href="https://github.com/WebAssembly/stack-switching/discussions/27">stack switching</a> in WebAssembly. Here’s my Python rewrite of the original example demonstrating the issue: (full code in a few languages at the end of the post)</p>
<p>We’re implementing sequences that call a callback with the elements in the sequence:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true"></a><span class="co">## The base class for sequences.</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true"></a><span class="kw">class</span> Sequence:</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true"></a>        <span class="cf">raise</span> <span class="pp">NotImplementedError</span></span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true"></a></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true"></a></span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true"></a><span class="co">## Counts from a given integer up. Does not stop.</span></span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true"></a><span class="kw">class</span> CountFrom(Sequence):</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true"></a>    <span class="kw">def</span> <span class="fu">__init__</span>(<span class="va">self</span>, start: <span class="bu">int</span>):</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true"></a>        <span class="va">self</span>.start <span class="op">=</span> start</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true"></a></span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true"></a>        i <span class="op">=</span> <span class="va">self</span>.start</span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true"></a>        <span class="cf">while</span> <span class="va">True</span>:</span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true"></a>            consumer(i)</span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true"></a>            i <span class="op">+=</span> <span class="dv">1</span></span>
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true"></a></span>
<span id="cb1-18"><a href="#cb1-18" aria-hidden="true"></a></span>
<span id="cb1-19"><a href="#cb1-19" aria-hidden="true"></a><span class="co">## An empty sequence: does not call the callback.</span></span>
<span id="cb1-20"><a href="#cb1-20" aria-hidden="true"></a><span class="kw">class</span> Empty(Sequence):</span>
<span id="cb1-21"><a href="#cb1-21" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb1-22"><a href="#cb1-22" aria-hidden="true"></a>        <span class="cf">pass</span></span></code></pre></div>
<p>We want to implement a sequence that takes two sequences and an amount as arguments. It runs the first sequence the given number of times, and then runs the second sequence in full.</p>
<p>A problem here is that sequences don’t support stopping after a while: they always run until completion (or forever, as in <code>CountFrom</code>). So how do we stop the first sequence after the given number of times?</p>
<p>We throw an exception in the first sequence’s callback and catch it in the call site that runs the first sequence. Here’s the full <code>AppendAfter</code> that implements this idea:</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true"></a><span class="co">## The exception used to signal that the first sequence should be stopped, in</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true"></a><span class="co">## `AppendAfter`.</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true"></a><span class="kw">class</span> AppendAfterException(<span class="pp">Exception</span>):</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true"></a>    <span class="cf">pass</span></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true"></a></span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true"></a></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true"></a><span class="co">## Runs the first sequence `amount` times, then runs the second sequence.</span></span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true"></a><span class="kw">class</span> AppendAfter(Sequence):</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true"></a>    <span class="kw">def</span> <span class="fu">__init__</span>(<span class="va">self</span>, first: Sequence, amount: <span class="bu">int</span>, second: Sequence):</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true"></a>        <span class="va">self</span>.first <span class="op">=</span> first</span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true"></a>        <span class="va">self</span>.amount <span class="op">=</span> amount</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true"></a>        <span class="va">self</span>.second <span class="op">=</span> second</span>
<span id="cb2-13"><a href="#cb2-13" aria-hidden="true"></a></span>
<span id="cb2-14"><a href="#cb2-14" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb2-15"><a href="#cb2-15" aria-hidden="true"></a>        count <span class="op">=</span> <span class="va">self</span>.amount</span>
<span id="cb2-16"><a href="#cb2-16" aria-hidden="true"></a></span>
<span id="cb2-17"><a href="#cb2-17" aria-hidden="true"></a>        <span class="co"># The callback for the first sequence. Throws an exception after being</span></span>
<span id="cb2-18"><a href="#cb2-18" aria-hidden="true"></a>        <span class="co"># called `amount` times to stop iterating the first sequence.</span></span>
<span id="cb2-19"><a href="#cb2-19" aria-hidden="true"></a>        <span class="kw">def</span> limited_consumer(element):</span>
<span id="cb2-20"><a href="#cb2-20" aria-hidden="true"></a>            <span class="kw">nonlocal</span> count</span>
<span id="cb2-21"><a href="#cb2-21" aria-hidden="true"></a></span>
<span id="cb2-22"><a href="#cb2-22" aria-hidden="true"></a>            <span class="co"># Note: weird `count` update below is intentional.</span></span>
<span id="cb2-23"><a href="#cb2-23" aria-hidden="true"></a>            current <span class="op">=</span> count</span>
<span id="cb2-24"><a href="#cb2-24" aria-hidden="true"></a>            count <span class="op">-=</span> <span class="dv">1</span></span>
<span id="cb2-25"><a href="#cb2-25" aria-hidden="true"></a>            <span class="cf">if</span> current <span class="op">==</span> <span class="dv">0</span>:</span>
<span id="cb2-26"><a href="#cb2-26" aria-hidden="true"></a>                <span class="cf">raise</span> AppendAfterException()</span>
<span id="cb2-27"><a href="#cb2-27" aria-hidden="true"></a>            consumer(element)</span>
<span id="cb2-28"><a href="#cb2-28" aria-hidden="true"></a></span>
<span id="cb2-29"><a href="#cb2-29" aria-hidden="true"></a>        <span class="co"># Run the first sequence until the callback throws, signalling to stop</span></span>
<span id="cb2-30"><a href="#cb2-30" aria-hidden="true"></a>        <span class="co"># the first sequence.</span></span>
<span id="cb2-31"><a href="#cb2-31" aria-hidden="true"></a>        <span class="cf">try</span>:</span>
<span id="cb2-32"><a href="#cb2-32" aria-hidden="true"></a>            <span class="va">self</span>.first.for_each(limited_consumer)</span>
<span id="cb2-33"><a href="#cb2-33" aria-hidden="true"></a>        <span class="cf">except</span> AppendAfterException:</span>
<span id="cb2-34"><a href="#cb2-34" aria-hidden="true"></a>            <span class="cf">pass</span></span>
<span id="cb2-35"><a href="#cb2-35" aria-hidden="true"></a></span>
<span id="cb2-36"><a href="#cb2-36" aria-hidden="true"></a>        <span class="va">self</span>.second.for_each(consumer)</span></code></pre></div>
<p>Here’s an example of how this works:</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true"></a>AppendAfter(CountFrom(<span class="dv">0</span>), <span class="dv">5</span>, Empty()).for_each(<span class="bu">print</span>)</span></code></pre></div>
<p>This prints: 0, 1, 2, 3, 4. (each on a new line)</p>
<p>But the code also has a bug. Here’s another use of it that doesn’t work as expected:</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true"></a>AppendAfter(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true"></a>    AppendAfter(CountFrom(<span class="dv">0</span>), <span class="dv">10</span>, CountFrom(<span class="dv">20</span>)),</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true"></a>    <span class="dv">5</span>,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true"></a>    Empty()</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true"></a>).for_each(<span class="bu">print</span>)</span></code></pre></div>
<p>This prints 0 through 4, then jumps to 20, and keeps counting up from there forever.</p>
<p>Here’s the problem: the outer <code>AppendAfter</code> counts to 5 in the callback it passes to the inner <code>AppendAfter</code> and then throws an exception to stop iteration. The inner <code>AppendAfter</code> wraps that callback in its own counting callback and passes the wrapper to its first sequence. When the outer <code>AppendAfter</code>’s callback throws after 5 iterations, the exception is handled by the inner <code>AppendAfter</code>’s exception handler, which then moves on to its second sequence, <code>CountFrom(20)</code>. So the outer <code>AppendAfter</code> never sees this exception, and it keeps running its first sequence.</p>
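<p>To see which handler actually catches the exception, here is a small instrumented sketch of the buggy version. The <code>name</code> argument, the <code>caught_by</code> list, and the <code>StopDemo</code> exception are not part of the original example; they are assumptions added only to trace the handlers and to cut the otherwise infinite run short.</p>

```python
from collections.abc import Callable


class AppendAfterException(Exception):
    pass


class StopDemo(Exception):
    """Hypothetical helper: ends the otherwise infinite demo run."""


class Sequence:
    def for_each(self, consumer: Callable) -> None:
        raise NotImplementedError


class CountFrom(Sequence):
    def __init__(self, start: int):
        self.start = start

    def for_each(self, consumer: Callable) -> None:
        i = self.start
        while True:
            consumer(i)
            i += 1


class Empty(Sequence):
    def for_each(self, consumer: Callable) -> None:
        pass


# Names of the AppendAfter instances whose handler caught an exception.
caught_by: list[str] = []


class AppendAfter(Sequence):
    def __init__(self, name: str, first: Sequence, amount: int, second: Sequence):
        self.name = name  # hypothetical label, used only for tracing
        self.first = first
        self.amount = amount
        self.second = second

    def for_each(self, consumer: Callable) -> None:
        count = self.amount

        def limited_consumer(element):
            nonlocal count
            current = count
            count -= 1
            if current == 0:
                raise AppendAfterException()
            consumer(element)

        try:
            self.first.for_each(limited_consumer)
        except AppendAfterException:
            # Record which instance handled the exception.
            caught_by.append(self.name)

        self.second.for_each(consumer)


seen: list[int] = []


def bounded_collect(element: int) -> None:
    # Stop the demo once eight elements have been collected.
    if len(seen) == 8:
        raise StopDemo()
    seen.append(element)


inner = AppendAfter("inner", CountFrom(0), 10, CountFrom(20))
outer = AppendAfter("outer", inner, 5, Empty())

try:
    outer.for_each(bounded_collect)
except StopDemo:
    pass

print(seen)       # [0, 1, 2, 3, 4, 20, 21, 22]
print(caught_by)  # ['inner']
```

<p>The only handler that ever runs is the inner one, even though the exception was raised by the outer callback.</p>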
<p>The outer <code>AppendAfter</code>’s callback never throws an exception again, because of the way we update the <code>count</code> local: we update it first and then check its previous value. This looks strange in Python, but in a language with pre/post increments/decrements it looks more plausible:</p>
<pre><code>if (count-- == 0) {
  throw AppendAfterException();
}</code></pre>
<p>Once this exception is caught by the wrong handler, <code>count</code> never becomes 0 again, so the iteration never stops.</p>
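<p>The counter behaviour can be looked at in isolation. The sketch below compares the post-decrement check used in the example with a variant that only decrements when it does not signal a stop. Both helper names are hypothetical, and the functions return a boolean instead of raising so that the results are easy to print.</p>

```python
def make_post_decrement(amount: int):
    """Mirrors `count-- == 0`: decrement first, then test the old value."""
    count = amount

    def should_stop() -> bool:
        nonlocal count
        current = count
        count -= 1
        return current == 0

    return should_stop


def make_guarded(amount: int):
    """Variant that only decrements when it does not signal a stop."""
    count = amount

    def should_stop() -> bool:
        nonlocal count
        if count == 0:
            return True
        count -= 1
        return False

    return should_stop


post = make_post_decrement(2)
guarded = make_guarded(2)

post_results = [post() for _ in range(5)]
guarded_results = [guarded() for _ in range(5)]

print(post_results)     # [False, False, True, False, False]
print(guarded_results)  # [False, False, True, True, True]
```

<p>The post-decrement version signals exactly once and then goes negative, which is why a single swallowed exception is enough to make the loop run forever. The guarded version keeps signalling, so the nested example would terminate with it, but the first exception would still be caught by the wrong handler.</p>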
<p>According to the original post, an exception should be a “shared secret” between a raiser and a handler, meaning no other handler (other than the intended one) should be able to intercept and decipher it.</p>
<p>I’m not aware of any language that allows this kind of exception<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>. To fix this in a way that somewhat resembles the exceptions explained in the original post, we need something unique shared between a raiser and a handler, so that the handler only catches the right exceptions and propagates the rest. In our demo, this is just a matter of creating the exception value ahead of time, in a scope shared between the raiser and handler, and then handling based on object identity. Here’s the fixed <code>AppendAfter</code>:</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true"></a><span class="kw">class</span> AppendAfter(Sequence):</span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true"></a>    ...</span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true"></a></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true"></a>        count <span class="op">=</span> <span class="va">self</span>.amount</span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true"></a></span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true"></a>        <span class="co"># We create the exception value ahead of time. Both the raiser and</span></span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true"></a>        <span class="co"># handler have access to it.</span></span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true"></a>        sentinel <span class="op">=</span> AppendAfterException()</span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true"></a></span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true"></a>        <span class="kw">def</span> limited_consumer(element):</span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true"></a>            <span class="kw">nonlocal</span> count</span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true"></a>            current <span class="op">=</span> count</span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true"></a>            count <span class="op">-=</span> <span class="dv">1</span></span>
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true"></a>            <span class="cf">if</span> current <span class="op">==</span> <span class="dv">0</span>:</span>
<span id="cb6-16"><a href="#cb6-16" aria-hidden="true"></a>                <span class="cf">raise</span> sentinel</span>
<span id="cb6-17"><a href="#cb6-17" aria-hidden="true"></a>            consumer(element)</span>
<span id="cb6-18"><a href="#cb6-18" aria-hidden="true"></a></span>
<span id="cb6-19"><a href="#cb6-19" aria-hidden="true"></a>        <span class="cf">try</span>:</span>
<span id="cb6-20"><a href="#cb6-20" aria-hidden="true"></a>            <span class="va">self</span>.first.for_each(limited_consumer)</span>
<span id="cb6-21"><a href="#cb6-21" aria-hidden="true"></a>        <span class="cf">except</span> AppendAfterException <span class="im">as</span> e:</span>
<span id="cb6-22"><a href="#cb6-22" aria-hidden="true"></a>            <span class="cf">if</span> e <span class="kw">is</span> <span class="kw">not</span> sentinel:</span>
<span id="cb6-23"><a href="#cb6-23" aria-hidden="true"></a>                <span class="cf">raise</span></span>
<span id="cb6-24"><a href="#cb6-24" aria-hidden="true"></a></span>
<span id="cb6-25"><a href="#cb6-25" aria-hidden="true"></a>        <span class="va">self</span>.second.for_each(consumer)</span></code></pre></div>
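<p>A variation on the same idea, not taken from the original discussion: instead of comparing object identities, each <code>for_each</code> call can define its own exception class, so the <code>except</code> clause of one call cannot match an exception raised for another. This is only a sketch; the <code>Stop</code> class and the <code>collect</code> helper are names made up for this illustration.</p>

```python
from collections.abc import Callable


class Sequence:
    def for_each(self, consumer: Callable) -> None:
        raise NotImplementedError


class CountFrom(Sequence):
    def __init__(self, start: int):
        self.start = start

    def for_each(self, consumer: Callable) -> None:
        i = self.start
        while True:
            consumer(i)
            i += 1


class Empty(Sequence):
    def for_each(self, consumer: Callable) -> None:
        pass


class AppendAfter(Sequence):
    def __init__(self, first: Sequence, amount: int, second: Sequence):
        self.first = first
        self.amount = amount
        self.second = second

    def for_each(self, consumer: Callable) -> None:
        count = self.amount

        # A new exception class is created on every call. Other `for_each`
        # calls have their own, unrelated `Stop` classes.
        class Stop(Exception):
            pass

        def limited_consumer(element):
            nonlocal count
            current = count
            count -= 1
            if current == 0:
                raise Stop()
            consumer(element)

        try:
            self.first.for_each(limited_consumer)
        except Stop:
            pass

        self.second.for_each(consumer)


def collect(seq: Sequence) -> list[int]:
    """Hypothetical helper: runs a (finite) sequence and returns its elements."""
    out: list[int] = []
    seq.for_each(out.append)
    return out


simple = collect(AppendAfter(CountFrom(0), 5, Empty()))
nested = collect(
    AppendAfter(AppendAfter(CountFrom(0), 10, CountFrom(20)), 5, Empty())
)

print(simple)  # [0, 1, 2, 3, 4]
print(nested)  # [0, 1, 2, 3, 4]
```

<p>As with the sentinel version, this only protects against other <code>AppendAfter</code> handlers. A handler somewhere in between that catches <code>Exception</code> would still intercept it.</p>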
<p>Full code:</p>
<details>
<p><summary>Python implementation with the bug</summary></p>
<div class="sourceCode" id="cb7"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true"></a><span class="im">from</span> collections.abc <span class="im">import</span> Callable</span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true"></a></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true"></a></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true"></a><span class="kw">class</span> AppendAfterException(<span class="pp">Exception</span>):</span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true"></a>    <span class="cf">pass</span></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true"></a></span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true"></a></span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true"></a><span class="kw">class</span> Sequence:</span>
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true"></a>        <span class="cf">raise</span> <span class="pp">NotImplementedError</span></span>
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true"></a></span>
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true"></a></span>
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true"></a><span class="kw">class</span> CountFrom(Sequence):</span>
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true"></a>    <span class="kw">def</span> <span class="fu">__init__</span>(<span class="va">self</span>, start: <span class="bu">int</span>):</span>
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true"></a>        <span class="va">self</span>.start <span class="op">=</span> start</span>
<span id="cb7-16"><a href="#cb7-16" aria-hidden="true"></a></span>
<span id="cb7-17"><a href="#cb7-17" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb7-18"><a href="#cb7-18" aria-hidden="true"></a>        i <span class="op">=</span> <span class="va">self</span>.start</span>
<span id="cb7-19"><a href="#cb7-19" aria-hidden="true"></a>        <span class="cf">while</span> <span class="va">True</span>:</span>
<span id="cb7-20"><a href="#cb7-20" aria-hidden="true"></a>            consumer(i)</span>
<span id="cb7-21"><a href="#cb7-21" aria-hidden="true"></a>            i <span class="op">+=</span> <span class="dv">1</span></span>
<span id="cb7-22"><a href="#cb7-22" aria-hidden="true"></a></span>
<span id="cb7-23"><a href="#cb7-23" aria-hidden="true"></a></span>
<span id="cb7-24"><a href="#cb7-24" aria-hidden="true"></a><span class="kw">class</span> Empty(Sequence):</span>
<span id="cb7-25"><a href="#cb7-25" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb7-26"><a href="#cb7-26" aria-hidden="true"></a>        <span class="cf">pass</span></span>
<span id="cb7-27"><a href="#cb7-27" aria-hidden="true"></a></span>
<span id="cb7-28"><a href="#cb7-28" aria-hidden="true"></a></span>
<span id="cb7-29"><a href="#cb7-29" aria-hidden="true"></a><span class="kw">class</span> AppendAfter(Sequence):</span>
<span id="cb7-30"><a href="#cb7-30" aria-hidden="true"></a>    <span class="kw">def</span> <span class="fu">__init__</span>(<span class="va">self</span>, first: Sequence, amount: <span class="bu">int</span>, second: Sequence):</span>
<span id="cb7-31"><a href="#cb7-31" aria-hidden="true"></a>        <span class="va">self</span>.first <span class="op">=</span> first</span>
<span id="cb7-32"><a href="#cb7-32" aria-hidden="true"></a>        <span class="va">self</span>.amount <span class="op">=</span> amount</span>
<span id="cb7-33"><a href="#cb7-33" aria-hidden="true"></a>        <span class="va">self</span>.second <span class="op">=</span> second</span>
<span id="cb7-34"><a href="#cb7-34" aria-hidden="true"></a></span>
<span id="cb7-35"><a href="#cb7-35" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb7-36"><a href="#cb7-36" aria-hidden="true"></a>        count <span class="op">=</span> <span class="va">self</span>.amount</span>
<span id="cb7-37"><a href="#cb7-37" aria-hidden="true"></a></span>
<span id="cb7-38"><a href="#cb7-38" aria-hidden="true"></a>        <span class="kw">def</span> limited_consumer(element):</span>
<span id="cb7-39"><a href="#cb7-39" aria-hidden="true"></a>            <span class="kw">nonlocal</span> count</span>
<span id="cb7-40"><a href="#cb7-40" aria-hidden="true"></a>            <span class="co"># Note: if you change this to only decrement count when not</span></span>
<span id="cb7-41"><a href="#cb7-41" aria-hidden="true"></a>            <span class="co"># throwing, this works as expected.</span></span>
<span id="cb7-42"><a href="#cb7-42" aria-hidden="true"></a>            <span class="co">#</span></span>
<span id="cb7-43"><a href="#cb7-43" aria-hidden="true"></a>            <span class="co"># The point is, outer AppendAfter&#39;s exception is caught by the</span></span>
<span id="cb7-44"><a href="#cb7-44" aria-hidden="true"></a>            <span class="co"># inner AppendAfter, which then leaves outer AppendAfter in an</span></span>
<span id="cb7-45"><a href="#cb7-45" aria-hidden="true"></a>            <span class="co"># invalid state where count is negative.</span></span>
<span id="cb7-46"><a href="#cb7-46" aria-hidden="true"></a>            current <span class="op">=</span> count</span>
<span id="cb7-47"><a href="#cb7-47" aria-hidden="true"></a>            count <span class="op">-=</span> <span class="dv">1</span></span>
<span id="cb7-48"><a href="#cb7-48" aria-hidden="true"></a>            <span class="cf">if</span> current <span class="op">==</span> <span class="dv">0</span>:</span>
<span id="cb7-49"><a href="#cb7-49" aria-hidden="true"></a>                <span class="cf">raise</span> AppendAfterException()</span>
<span id="cb7-50"><a href="#cb7-50" aria-hidden="true"></a>            consumer(element)</span>
<span id="cb7-51"><a href="#cb7-51" aria-hidden="true"></a></span>
<span id="cb7-52"><a href="#cb7-52" aria-hidden="true"></a>        <span class="cf">try</span>:</span>
<span id="cb7-53"><a href="#cb7-53" aria-hidden="true"></a>            <span class="va">self</span>.first.for_each(limited_consumer)</span>
<span id="cb7-54"><a href="#cb7-54" aria-hidden="true"></a>        <span class="cf">except</span> AppendAfterException:</span>
<span id="cb7-55"><a href="#cb7-55" aria-hidden="true"></a>            <span class="cf">pass</span></span>
<span id="cb7-56"><a href="#cb7-56" aria-hidden="true"></a></span>
<span id="cb7-57"><a href="#cb7-57" aria-hidden="true"></a>        <span class="va">self</span>.second.for_each(consumer)</span>
<span id="cb7-58"><a href="#cb7-58" aria-hidden="true"></a></span>
<span id="cb7-59"><a href="#cb7-59" aria-hidden="true"></a></span>
<span id="cb7-60"><a href="#cb7-60" aria-hidden="true"></a><span class="cf">if</span> <span class="va">__name__</span> <span class="op">==</span> <span class="st">&quot;__main__&quot;</span>:</span>
<span id="cb7-61"><a href="#cb7-61" aria-hidden="true"></a>    <span class="co"># Works:</span></span>
<span id="cb7-62"><a href="#cb7-62" aria-hidden="true"></a>    AppendAfter(CountFrom(<span class="dv">0</span>), <span class="dv">5</span>, Empty()).for_each(<span class="bu">print</span>)</span>
<span id="cb7-63"><a href="#cb7-63" aria-hidden="true"></a></span>
<span id="cb7-64"><a href="#cb7-64" aria-hidden="true"></a>    <span class="co"># Loops:</span></span>
<span id="cb7-65"><a href="#cb7-65" aria-hidden="true"></a>    AppendAfter(</span>
<span id="cb7-66"><a href="#cb7-66" aria-hidden="true"></a>        AppendAfter(CountFrom(<span class="dv">0</span>), <span class="dv">10</span>, CountFrom(<span class="dv">20</span>)),</span>
<span id="cb7-67"><a href="#cb7-67" aria-hidden="true"></a>        <span class="dv">5</span>,</span>
<span id="cb7-68"><a href="#cb7-68" aria-hidden="true"></a>        Empty()</span>
<span id="cb7-69"><a href="#cb7-69" aria-hidden="true"></a>    ).for_each(<span class="bu">print</span>)</span></code></pre></div>
</details>
<details>
<p><summary>Python implementation with the bug fixed</summary></p>
<div class="sourceCode" id="cb8"><pre class="sourceCode python"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true"></a><span class="im">from</span> collections.abc <span class="im">import</span> Callable</span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true"></a></span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true"></a></span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true"></a><span class="kw">class</span> AppendAfterException(<span class="pp">Exception</span>):</span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true"></a>    <span class="cf">pass</span></span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true"></a></span>
<span id="cb8-7"><a href="#cb8-7" aria-hidden="true"></a></span>
<span id="cb8-8"><a href="#cb8-8" aria-hidden="true"></a><span class="kw">class</span> Sequence:</span>
<span id="cb8-9"><a href="#cb8-9" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb8-10"><a href="#cb8-10" aria-hidden="true"></a>        <span class="cf">raise</span> <span class="pp">NotImplementedError</span></span>
<span id="cb8-11"><a href="#cb8-11" aria-hidden="true"></a></span>
<span id="cb8-12"><a href="#cb8-12" aria-hidden="true"></a></span>
<span id="cb8-13"><a href="#cb8-13" aria-hidden="true"></a><span class="kw">class</span> CountFrom(Sequence):</span>
<span id="cb8-14"><a href="#cb8-14" aria-hidden="true"></a>    <span class="kw">def</span> <span class="fu">__init__</span>(<span class="va">self</span>, start: <span class="bu">int</span>):</span>
<span id="cb8-15"><a href="#cb8-15" aria-hidden="true"></a>        <span class="va">self</span>.start <span class="op">=</span> start</span>
<span id="cb8-16"><a href="#cb8-16" aria-hidden="true"></a></span>
<span id="cb8-17"><a href="#cb8-17" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb8-18"><a href="#cb8-18" aria-hidden="true"></a>        i <span class="op">=</span> <span class="va">self</span>.start</span>
<span id="cb8-19"><a href="#cb8-19" aria-hidden="true"></a>        <span class="cf">while</span> <span class="va">True</span>:</span>
<span id="cb8-20"><a href="#cb8-20" aria-hidden="true"></a>            consumer(i)</span>
<span id="cb8-21"><a href="#cb8-21" aria-hidden="true"></a>            i <span class="op">+=</span> <span class="dv">1</span></span>
<span id="cb8-22"><a href="#cb8-22" aria-hidden="true"></a></span>
<span id="cb8-23"><a href="#cb8-23" aria-hidden="true"></a></span>
<span id="cb8-24"><a href="#cb8-24" aria-hidden="true"></a><span class="kw">class</span> Empty(Sequence):</span>
<span id="cb8-25"><a href="#cb8-25" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb8-26"><a href="#cb8-26" aria-hidden="true"></a>        <span class="cf">pass</span></span>
<span id="cb8-27"><a href="#cb8-27" aria-hidden="true"></a></span>
<span id="cb8-28"><a href="#cb8-28" aria-hidden="true"></a></span>
<span id="cb8-29"><a href="#cb8-29" aria-hidden="true"></a><span class="kw">class</span> AppendAfter(Sequence):</span>
<span id="cb8-30"><a href="#cb8-30" aria-hidden="true"></a>    <span class="kw">def</span> <span class="fu">__init__</span>(<span class="va">self</span>, first: Sequence, amount: <span class="bu">int</span>, second: Sequence):</span>
<span id="cb8-31"><a href="#cb8-31" aria-hidden="true"></a>        <span class="va">self</span>.first <span class="op">=</span> first</span>
<span id="cb8-32"><a href="#cb8-32" aria-hidden="true"></a>        <span class="va">self</span>.amount <span class="op">=</span> amount</span>
<span id="cb8-33"><a href="#cb8-33" aria-hidden="true"></a>        <span class="va">self</span>.second <span class="op">=</span> second</span>
<span id="cb8-34"><a href="#cb8-34" aria-hidden="true"></a></span>
<span id="cb8-35"><a href="#cb8-35" aria-hidden="true"></a>    <span class="kw">def</span> for_each(<span class="va">self</span>, consumer: Callable) <span class="op">-&gt;</span> <span class="va">None</span>:</span>
<span id="cb8-36"><a href="#cb8-36" aria-hidden="true"></a>        count <span class="op">=</span> <span class="va">self</span>.amount</span>
<span id="cb8-37"><a href="#cb8-37" aria-hidden="true"></a>        sentinel <span class="op">=</span> AppendAfterException()</span>
<span id="cb8-38"><a href="#cb8-38" aria-hidden="true"></a></span>
<span id="cb8-39"><a href="#cb8-39" aria-hidden="true"></a>        <span class="kw">def</span> limited_consumer(element):</span>
<span id="cb8-40"><a href="#cb8-40" aria-hidden="true"></a>            <span class="kw">nonlocal</span> count</span>
<span id="cb8-41"><a href="#cb8-41" aria-hidden="true"></a>            current <span class="op">=</span> count</span>
<span id="cb8-42"><a href="#cb8-42" aria-hidden="true"></a>            count <span class="op">-=</span> <span class="dv">1</span></span>
<span id="cb8-43"><a href="#cb8-43" aria-hidden="true"></a>            <span class="cf">if</span> current <span class="op">==</span> <span class="dv">0</span>:</span>
<span id="cb8-44"><a href="#cb8-44" aria-hidden="true"></a>                <span class="cf">raise</span> sentinel</span>
<span id="cb8-45"><a href="#cb8-45" aria-hidden="true"></a>            consumer(element)</span>
<span id="cb8-46"><a href="#cb8-46" aria-hidden="true"></a></span>
<span id="cb8-47"><a href="#cb8-47" aria-hidden="true"></a>        <span class="cf">try</span>:</span>
<span id="cb8-48"><a href="#cb8-48" aria-hidden="true"></a>            <span class="va">self</span>.first.for_each(limited_consumer)</span>
<span id="cb8-49"><a href="#cb8-49" aria-hidden="true"></a>        <span class="cf">except</span> AppendAfterException <span class="im">as</span> e:</span>
<span id="cb8-50"><a href="#cb8-50" aria-hidden="true"></a>            <span class="cf">if</span> e <span class="kw">is</span> <span class="kw">not</span> sentinel:</span>
<span id="cb8-51"><a href="#cb8-51" aria-hidden="true"></a>                <span class="cf">raise</span></span>
<span id="cb8-52"><a href="#cb8-52" aria-hidden="true"></a></span>
<span id="cb8-53"><a href="#cb8-53" aria-hidden="true"></a>        <span class="va">self</span>.second.for_each(consumer)</span>
<span id="cb8-54"><a href="#cb8-54" aria-hidden="true"></a></span>
<span id="cb8-55"><a href="#cb8-55" aria-hidden="true"></a></span>
<span id="cb8-56"><a href="#cb8-56" aria-hidden="true"></a><span class="cf">if</span> <span class="va">__name__</span> <span class="op">==</span> <span class="st">&quot;__main__&quot;</span>:</span>
<span id="cb8-57"><a href="#cb8-57" aria-hidden="true"></a>    <span class="co"># Works:</span></span>
<span id="cb8-58"><a href="#cb8-58" aria-hidden="true"></a>    AppendAfter(CountFrom(<span class="dv">0</span>), <span class="dv">5</span>, Empty()).for_each(<span class="bu">print</span>)</span>
<span id="cb8-59"><a href="#cb8-59" aria-hidden="true"></a></span>
<span id="cb8-60"><a href="#cb8-60" aria-hidden="true"></a>    <span class="co"># Also works now:</span></span>
<span id="cb8-61"><a href="#cb8-61" aria-hidden="true"></a>    AppendAfter(</span>
<span id="cb8-62"><a href="#cb8-62" aria-hidden="true"></a>        AppendAfter(CountFrom(<span class="dv">0</span>), <span class="dv">10</span>, CountFrom(<span class="dv">20</span>)),</span>
<span id="cb8-63"><a href="#cb8-63" aria-hidden="true"></a>        <span class="dv">5</span>,</span>
<span id="cb8-64"><a href="#cb8-64" aria-hidden="true"></a>        Empty()</span>
<span id="cb8-65"><a href="#cb8-65" aria-hidden="true"></a>    ).for_each(<span class="bu">print</span>)</span></code></pre></div>
</details>
<p>If you want to experiment with this in other languages:</p>
<details>
<p><summary>Dart implementation</summary></p>
<pre class="dart"><code>abstract class Sequence&lt;Element&gt; {
  void forEach(void Function(Element) consumer);
}

class CountFrom implements Sequence&lt;int&gt; {
  final int from;

  CountFrom(this.from);

  @override
  void forEach(void Function(int) consumer) {
    for (int i = from; ; i += 1) {
      consumer(i);
    }
  }
}

class Empty implements Sequence&lt;int&gt; {
  @override
  void forEach(void Function(int) consumer) {}
}

class AppendAfter&lt;Element&gt; implements Sequence&lt;Element&gt; {
  final Sequence&lt;Element&gt; first;
  final Sequence&lt;Element&gt; second;
  final int amount;

  AppendAfter(this.first, this.amount, this.second);

  @override
  void forEach(void Function(Element) consumer) {
    try {
      int count = amount;
      first.forEach((element) {
        if (count-- == 0) {
          throw AppendAfterException();
        }
        consumer(element);
      });
    } on AppendAfterException {}
    second.forEach(consumer);
  }
}

class AppendAfterException {}

void main() {
  // final simple = AppendAfter(CountFrom(0), 5, Empty());
  // simple.forEach((i) =&gt; print(i));

  final complex = AppendAfter(AppendAfter(CountFrom(0), 10, CountFrom(20)), 5, Empty());
  complex.forEach((i) =&gt; print(i));
}</code></pre>
</details>
<details>
<p><summary>Fir implementation</summary></p>
<pre><code>trait Sequence[seq, t, exn]:
    forEach(self: seq, consumer: Fn(t) / exn) / exn

# ------------------------------------------------------------------------------

type CountFrom(from: U32)

impl Sequence[CountFrom, U32, exn]:
    forEach(self: CountFrom, consumer: Fn(U32) / exn) / exn:
        let i = self.from
        loop:
            consumer(i)
            i += 1

# ------------------------------------------------------------------------------

type AppendAfter[s1, s2](
    seq1: s1,
    seq2: s2,
    amt: U32,
)

type AppendAfterStop:
    AppendAfterStop

impl[Sequence[s1, t, [AppendAfterStop, ..exn]], Sequence[s2, t, [AppendAfterStop, ..exn]]]
        Sequence[AppendAfter[s1, s2], t, [AppendAfterStop, ..exn]]:
    forEach(
            self: AppendAfter[s1, s2],
            consumer: Fn(t) / [AppendAfterStop, ..exn]
        ) / [AppendAfterStop, ..exn]:
        match try(\():
            self.seq1.forEach(\(i: t) / [AppendAfterStop, ..exn]:
                let current = self.amt
                self.amt -= 1
                if current == 0:
                    throw(~AppendAfterStop.AppendAfterStop)
                consumer(i))):
            Result.Ok(()) | Result.Err(~AppendAfterStop.AppendAfterStop):
                self.seq2.forEach(consumer)

# ------------------------------------------------------------------------------

type EmptySeq:
    EmptySeq

impl Sequence[EmptySeq, t, exn]:
    forEach(self: EmptySeq, consumer: Fn(t) / exn) / exn:
        ()

# ------------------------------------------------------------------------------

main():
    let seq =
        AppendAfter(
            seq1 = AppendAfter(seq1 = CountFrom(from = 0), seq2 = CountFrom(from = 10), amt = 5),
            seq2 = EmptySeq.EmptySeq,
            amt = 5,
        )

    try[(), [AppendAfterStop], []](
        \(): seq.forEach(\(i: U32): print(i)))

    ()</code></pre>
</details>
<p>The Fir implementation demonstrates that this is not a typing issue: it happens even with checked exceptions.</p>
<p>Note that in debug builds this Fir program will crash because of an underflow: the counter goes below 0 as explained above, but it’s not allowed to, as the counter type is unsigned. If you want it to loop, run in release mode.</p>
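<p>For readers who want to try one more language, here is a minimal OCaml sketch of the same nested scenario. It is not from the original discussion: it relies on OCaml’s local exception syntax (<code>let exception ... in</code>), which creates a fresh exception constructor each time the definition is evaluated, so an inner <code>append_after</code> cannot catch the outer one’s <code>Stop</code> by accident. The names <code>seq</code>, <code>count_from</code>, <code>append_after</code>, and <code>to_list</code> are made up for the sketch, and <code>count_from</code> takes an upper bound so that the program always terminates.</p>
<pre class="ocaml"><code>(* A sequence is modelled as its for_each function: it pushes elements to a consumer. *)
type 'a seq = ('a -&gt; unit) -&gt; unit

(* Bounded stand-in for an infinite counter, so the sketch always terminates. *)
let count_from (start : int) (limit : int) : int seq =
 fun consumer -&gt;
  for i = start to limit do
    consumer i
  done

let empty : 'a seq = fun _ -&gt; ()

(* Yields the first [amount] elements of [first], then all of [second]. *)
let append_after (first : 'a seq) (amount : int) (second : 'a seq) : 'a seq =
 fun consumer -&gt;
  (* Evaluated on every call, so every call gets its own Stop constructor. *)
  let exception Stop in
  let remaining = ref amount in
  (try
     first (fun x -&gt;
         if !remaining = 0 then raise Stop;
         decr remaining;
         consumer x)
   with Stop -&gt; ());
  second consumer

(* Collects the elements of a sequence into a list, in order. *)
let to_list (s : 'a seq) : 'a list =
  let acc = ref [] in
  s (fun x -&gt; acc := x :: !acc);
  List.rev !acc

let () =
  let nested =
    append_after (append_after (count_from 0 1000) 10 (count_from 20 1000)) 5 empty
  in
  (* The outer Stop passes through the inner handler untouched. *)
  assert (to_list nested = [ 0; 1; 2; 3; 4 ])</code></pre>
<p>If <code>Stop</code> were declared once at the top level instead, the inner handler would presumably intercept the outer exception, which is the behavior the Dart and Fir versions above are meant to exhibit.</p>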
<section class="footnotes" role="doc-endnotes">
<hr />
<ol>
<li id="fn1" role="doc-endnote"><p>I’ve briefly looked into how exceptions work in SML as the original post mentions it a few times. In SML you can catch all exceptions, so you can intercept anything and it doesn’t fully implement Robert’s ideal exception semantics.<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>]]></summary>
</entry>
<entry>
    <title>Containing contagious types with OCaml modules</title>
    <link href="http://osa1.net/posts/2026-03-10-containing-contagious-types.html" />
    <id>http://osa1.net/posts/2026-03-10-containing-contagious-types.html</id>
    <published>2026-03-10T00:00:00Z</published>
    <updated>2026-03-10T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>In the <a href="https://osa1.net/posts/2026-03-07-extensible-named-types-fir.html">previous post</a> we looked at a way to extend product types with new fields and sum types with new constructors, using row types, in Fir.</p>
<p>A problem with the approach was that it required adding type parameters to the type being extended. In the cases where the extended type is a sum type and different constructors are extended with different fields, we may even need more than one type parameter. Those type parameters can then be propagated to the use sites, and their use sites, and their use sites…</p>
<p>I call these kinds of type parameters “contagious”, and it’s difficult to completely avoid them in Fir. In Fir, most function types are polymorphic in the exceptions they throw. This allows things like: calling a function that doesn’t throw in throwing contexts, or calling a function that throws <code>Error1</code> and another that throws <code>Error2</code> from the same function, and inferring the calling function’s exception type as <code>[Error1, Error2, ..exn]</code>. The way we achieve this polymorphism<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a> is by having a type parameter representing the exceptions the function can throw<a href="#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a>.</p>
<p>So I thought, maybe instead of avoiding type parameters, we should think about how we might contain, or hide them, and I started to look at existing features in other languages.</p>
<p>In this post we’re going to look at how OCaml modules might be used for avoiding multiple type parameters (one for each extension). It turns out OCaml modules provide a solution that’s <strong>almost</strong> right.</p>
<p>(Full OCaml code is at the end of this post.)</p>
<h1 id="the-setup">The setup</h1>
<p>We have lots of AST types for expressions, statements, declarations, … and we want to make them extensible with new fields and new constructors. Different AST types will be extended with different fields or constructors, and even in the same AST type (e.g. <code>Expr</code> in the original post) we may need different types of extensions for different constructors of the type.</p>
<p>To keep things simple, in this post we’ll only add new fields.</p>
<p>As the language, we’ll use the lambda calculus with <code>let</code>s. Here’s what the AST could look like in OCaml:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true"></a><span class="kw">type</span> expr = Var <span class="kw">of</span> var | App <span class="kw">of</span> app | Abs <span class="kw">of</span> <span class="dt">abs</span> | Let <span class="kw">of</span> let_</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true"></a><span class="kw">and</span> var = { name : <span class="dt">string</span> }</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true"></a><span class="kw">and</span> app = { fn : expr; arg : expr }</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true"></a><span class="kw">and</span> <span class="dt">abs</span> = { param : <span class="dt">string</span>; body : expr }</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true"></a><span class="kw">and</span> let_ = { bound : <span class="dt">string</span>; rhs : expr; body : expr }</span></code></pre></div>
<p>With extensions:</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true"></a><span class="kw">type</span> (&#39;v, &#39;a, &#39;b, &#39;l) expr =</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true"></a>  | Var <span class="kw">of</span> &#39;v var</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true"></a>  | App <span class="kw">of</span> (&#39;v, &#39;a, &#39;b, &#39;l) app</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true"></a>  | Abs <span class="kw">of</span> (&#39;v, &#39;a, &#39;b, &#39;l) <span class="dt">abs</span></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true"></a>  | Let <span class="kw">of</span> (&#39;v, &#39;a, &#39;b, &#39;l) let_</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true"></a></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true"></a><span class="kw">and</span> &#39;v var = { name : <span class="dt">string</span>; var_ext : &#39;v }</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true"></a></span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true"></a><span class="kw">and</span> (&#39;v, &#39;a, &#39;b, &#39;l) app = {</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true"></a>  fn : (&#39;v, &#39;a, &#39;b, &#39;l) expr;</span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true"></a>  arg : (&#39;v, &#39;a, &#39;b, &#39;l) expr;</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true"></a>  app_ext : &#39;a;</span>
<span id="cb2-13"><a href="#cb2-13" aria-hidden="true"></a>}</span>
<span id="cb2-14"><a href="#cb2-14" aria-hidden="true"></a></span>
<span id="cb2-15"><a href="#cb2-15" aria-hidden="true"></a><span class="kw">and</span> (&#39;v, &#39;a, &#39;b, &#39;l) <span class="dt">abs</span> = {</span>
<span id="cb2-16"><a href="#cb2-16" aria-hidden="true"></a>  param : <span class="dt">string</span>;</span>
<span id="cb2-17"><a href="#cb2-17" aria-hidden="true"></a>  body : (&#39;v, &#39;a, &#39;b, &#39;l) expr;</span>
<span id="cb2-18"><a href="#cb2-18" aria-hidden="true"></a>  abs_ext : &#39;b;</span>
<span id="cb2-19"><a href="#cb2-19" aria-hidden="true"></a>}</span>
<span id="cb2-20"><a href="#cb2-20" aria-hidden="true"></a></span>
<span id="cb2-21"><a href="#cb2-21" aria-hidden="true"></a><span class="kw">and</span> (&#39;v, &#39;a, &#39;b, &#39;l) let_ = {</span>
<span id="cb2-22"><a href="#cb2-22" aria-hidden="true"></a>  bound : <span class="dt">string</span>;</span>
<span id="cb2-23"><a href="#cb2-23" aria-hidden="true"></a>  rhs : (&#39;v, &#39;a, &#39;b, &#39;l) expr;</span>
<span id="cb2-24"><a href="#cb2-24" aria-hidden="true"></a>  body : (&#39;v, &#39;a, &#39;b, &#39;l) expr;</span>
<span id="cb2-25"><a href="#cb2-25" aria-hidden="true"></a>  let_ext : &#39;l;</span>
<span id="cb2-26"><a href="#cb2-26" aria-hidden="true"></a>}</span></code></pre></div>
<p>This is obviously unusable and it won’t scale with more types and constructors.</p>
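<p>To make the problem concrete, here is a small sketch using the parameterized types above. The <code>count</code> function and the <code>fmt_expr</code> alias are hypothetical, added only for illustration: even a function that never looks at an extension field has to carry all four type parameters in its signature, and so does every type that mentions <code>expr</code>.</p>
<pre class="ocaml"><code>type ('v, 'a, 'b, 'l) expr =
  | Var of 'v var
  | App of ('v, 'a, 'b, 'l) app
  | Abs of ('v, 'a, 'b, 'l) abs
  | Let of ('v, 'a, 'b, 'l) let_

and 'v var = { name : string; var_ext : 'v }

and ('v, 'a, 'b, 'l) app = {
  fn : ('v, 'a, 'b, 'l) expr;
  arg : ('v, 'a, 'b, 'l) expr;
  app_ext : 'a;
}

and ('v, 'a, 'b, 'l) abs = {
  param : string;
  body : ('v, 'a, 'b, 'l) expr;
  abs_ext : 'b;
}

and ('v, 'a, 'b, 'l) let_ = {
  bound : string;
  rhs : ('v, 'a, 'b, 'l) expr;
  body : ('v, 'a, 'b, 'l) expr;
  let_ext : 'l;
}

(* Hypothetical: counts nodes and ignores every extension field, yet its type
   still has to mention all four parameters. *)
let rec count : 'v 'a 'b 'l. ('v, 'a, 'b, 'l) expr -&gt; int = function
  | Var _ -&gt; 1
  | App { fn; arg; _ } -&gt; 1 + count fn + count arg
  | Abs { body; _ } -&gt; 1 + count body
  | Let { rhs; body; _ } -&gt; 1 + count rhs + count body

(* Hypothetical: a formatter-style AST with no extensions still spells out
   four type arguments. *)
type fmt_expr = (unit, unit, unit, unit) expr

(* The term (\x. x) y, with unit in every extension slot. *)
let example : fmt_expr =
  App
    {
      fn = Abs { param = "x"; body = Var { name = "x"; var_ext = () }; abs_ext = () };
      arg = Var { name = "y"; var_ext = () };
      app_ext = ();
    }

let () = assert (count example = 4)</code></pre>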
<p>With modules, we can have a module signature with the AST types and abstract extension types, and implement it with different concrete types for the extension types.</p>
<p>We first define a module signature with the AST extensions:</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true"></a><span class="kw">module</span> <span class="kw">type</span> AST_EXTENSIONS = <span class="kw">sig</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true"></a>  <span class="kw">type</span> var_ext</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true"></a>  <span class="kw">type</span> app_ext</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true"></a>  <span class="kw">type</span> abs_ext</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true"></a>  <span class="kw">type</span> let_ext</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true"></a></span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true"></a>  <span class="kw">val</span> default_var_ext : var_ext</span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true"></a>  <span class="kw">val</span> default_app_ext : app_ext</span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true"></a>  <span class="kw">val</span> default_abs_ext : abs_ext</span>
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true"></a>  <span class="kw">val</span> default_let_ext : let_ext</span>
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true"></a><span class="kw">end</span></span></code></pre></div>
<p>The AST module signature then uses the extension types:</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true"></a><span class="kw">module</span> <span class="kw">type</span> AST = <span class="kw">sig</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true"></a>  <span class="kw">include</span> AST_EXTENSIONS</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true"></a></span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true"></a>  <span class="kw">type</span> expr = Var <span class="kw">of</span> var | App <span class="kw">of</span> app | Abs <span class="kw">of</span> <span class="dt">abs</span> | Let <span class="kw">of</span> let_</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true"></a>  <span class="kw">and</span> var = { name : <span class="dt">string</span>; var_ext : var_ext }</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true"></a>  <span class="kw">and</span> app = { fn : expr; arg : expr; app_ext : app_ext }</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true"></a>  <span class="kw">and</span> <span class="dt">abs</span> = { param : <span class="dt">string</span>; body : expr; abs_ext : abs_ext }</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true"></a>  <span class="kw">and</span> let_ = { bound : <span class="dt">string</span>; rhs : expr; body : expr; let_ext : let_ext }</span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true"></a><span class="kw">end</span></span></code></pre></div>
<p>We then use a functor to create new <code>AST</code> modules, with a given extension module:</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true"></a><span class="kw">module</span> MakeAst (Ext : AST_EXTENSIONS) :</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true"></a>  AST</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true"></a>    <span class="kw">with</span> <span class="kw">type</span> var_ext = Ext.var_ext</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true"></a>     <span class="kw">and</span> <span class="kw">type</span> app_ext = Ext.app_ext</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true"></a>     <span class="kw">and</span> <span class="kw">type</span> abs_ext = Ext.abs_ext</span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true"></a>     <span class="kw">and</span> <span class="kw">type</span> let_ext = Ext.let_ext = <span class="kw">struct</span></span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true"></a>  <span class="kw">type</span> var_ext = Ext.var_ext</span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true"></a>  <span class="kw">type</span> app_ext = Ext.app_ext</span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true"></a>  <span class="kw">type</span> abs_ext = Ext.abs_ext</span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true"></a>  <span class="kw">type</span> let_ext = Ext.let_ext</span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true"></a></span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true"></a>  <span class="kw">let</span> default_var_ext = Ext.default_var_ext</span>
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true"></a>  <span class="kw">let</span> default_app_ext = Ext.default_app_ext</span>
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true"></a>  <span class="kw">let</span> default_abs_ext = Ext.default_abs_ext</span>
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true"></a>  <span class="kw">let</span> default_let_ext = Ext.default_let_ext</span>
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true"></a></span>
<span id="cb5-17"><a href="#cb5-17" aria-hidden="true"></a>  <span class="kw">type</span> expr = Var <span class="kw">of</span> var | App <span class="kw">of</span> app | Abs <span class="kw">of</span> <span class="dt">abs</span> | Let <span class="kw">of</span> let_</span>
<span id="cb5-18"><a href="#cb5-18" aria-hidden="true"></a>  <span class="kw">and</span> var = { name : <span class="dt">string</span>; var_ext : Ext.var_ext }</span>
<span id="cb5-19"><a href="#cb5-19" aria-hidden="true"></a>  <span class="kw">and</span> app = { fn : expr; arg : expr; app_ext : app_ext }</span>
<span id="cb5-20"><a href="#cb5-20" aria-hidden="true"></a>  <span class="kw">and</span> <span class="dt">abs</span> = { param : <span class="dt">string</span>; body : expr; abs_ext : abs_ext }</span>
<span id="cb5-21"><a href="#cb5-21" aria-hidden="true"></a>  <span class="kw">and</span> let_ = { bound : <span class="dt">string</span>; rhs : expr; body : expr; let_ext : let_ext }</span>
<span id="cb5-22"><a href="#cb5-22" aria-hidden="true"></a><span class="kw">end</span></span></code></pre></div>
<p>In the first post we had two examples: a formatter that doesn’t need any extensions, and a type checker that needs to annotate AST nodes with inferred types. Here are the formatter’s and type checker’s AST modules:</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true"></a><span class="kw">module</span> FmtAst = MakeAst (<span class="kw">struct</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true"></a>  <span class="kw">type</span> var_ext = <span class="dt">unit</span></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true"></a>  <span class="kw">type</span> app_ext = <span class="dt">unit</span></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true"></a>  <span class="kw">type</span> abs_ext = <span class="dt">unit</span></span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true"></a>  <span class="kw">type</span> let_ext = <span class="dt">unit</span></span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true"></a></span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true"></a>  <span class="kw">let</span> default_var_ext = ()</span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true"></a>  <span class="kw">let</span> default_app_ext = ()</span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true"></a>  <span class="kw">let</span> default_abs_ext = ()</span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true"></a>  <span class="kw">let</span> default_let_ext = ()</span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true"></a><span class="kw">end</span>)</span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true"></a></span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true"></a><span class="co">(* The exact representation of types does not matter here; this is just a placeholder. *)</span></span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true"></a><span class="kw">type</span> ty = TyVar <span class="kw">of</span> <span class="dt">string</span> | TyArrow <span class="kw">of</span> ty * ty</span>
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true"></a></span>
<span id="cb6-16"><a href="#cb6-16" aria-hidden="true"></a><span class="co">(* Type-checking AST extensions. *)</span></span>
<span id="cb6-17"><a href="#cb6-17" aria-hidden="true"></a><span class="kw">type</span> tc_var_ext = { inferred_type : ty <span class="dt">option</span> }</span>
<span id="cb6-18"><a href="#cb6-18" aria-hidden="true"></a><span class="kw">type</span> tc_app_ext = { result_type : ty <span class="dt">option</span> }</span>
<span id="cb6-19"><a href="#cb6-19" aria-hidden="true"></a><span class="kw">type</span> tc_abs_ext = { param_type : ty <span class="dt">option</span> }</span>
<span id="cb6-20"><a href="#cb6-20" aria-hidden="true"></a><span class="kw">type</span> tc_let_ext = { bound_type : ty <span class="dt">option</span> }</span>
<span id="cb6-21"><a href="#cb6-21" aria-hidden="true"></a></span>
<span id="cb6-22"><a href="#cb6-22" aria-hidden="true"></a><span class="kw">module</span> TcAst = MakeAst (<span class="kw">struct</span></span>
<span id="cb6-23"><a href="#cb6-23" aria-hidden="true"></a>  <span class="kw">type</span> var_ext = tc_var_ext</span>
<span id="cb6-24"><a href="#cb6-24" aria-hidden="true"></a>  <span class="kw">type</span> app_ext = tc_app_ext</span>
<span id="cb6-25"><a href="#cb6-25" aria-hidden="true"></a>  <span class="kw">type</span> abs_ext = tc_abs_ext</span>
<span id="cb6-26"><a href="#cb6-26" aria-hidden="true"></a>  <span class="kw">type</span> let_ext = tc_let_ext</span>
<span id="cb6-27"><a href="#cb6-27" aria-hidden="true"></a></span>
<span id="cb6-28"><a href="#cb6-28" aria-hidden="true"></a>  <span class="kw">let</span> default_var_ext = { inferred_type = <span class="dt">None</span> }</span>
<span id="cb6-29"><a href="#cb6-29" aria-hidden="true"></a>  <span class="kw">let</span> default_app_ext = { result_type = <span class="dt">None</span> }</span>
<span id="cb6-30"><a href="#cb6-30" aria-hidden="true"></a>  <span class="kw">let</span> default_abs_ext = { param_type = <span class="dt">None</span> }</span>
<span id="cb6-31"><a href="#cb6-31" aria-hidden="true"></a>  <span class="kw">let</span> default_let_ext = { bound_type = <span class="dt">None</span> }</span>
<span id="cb6-32"><a href="#cb6-32" aria-hidden="true"></a><span class="kw">end</span>)</span></code></pre></div>
<p>Now, the parser needs to be able to allocate different ASTs at different use sites, and so that’s where we need one type parameter (actually, a module parameter). As far as I understand, we can’t have functions parametric over modules, so we need a functor for generating a given module’s AST in the parser, using the <code>default_..._ext</code> values in the AST module:</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true"></a><span class="kw">module</span> Parse (A : AST) = <span class="kw">struct</span></span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true"></a>  <span class="co">(* Parsing entry point: tokenizes and parses. *)</span></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true"></a>  <span class="kw">let</span> parse (<span class="dt">input</span> : <span class="dt">string</span>) : A.expr =</span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true"></a>    ...</span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true"></a></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true"></a>  <span class="co">(* Parse a single expression from tokens. *)</span></span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true"></a>  <span class="kw">let</span> <span class="kw">rec</span> parse_expr (toks : tokens) : A.expr * tokens =</span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true"></a>    ...</span>
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true"></a><span class="kw">end</span></span></code></pre></div>
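<p>A side note on functions versus functors: OCaml does have first-class modules, which let an ordinary function take a packed module as an argument when the types it mentions can be tied to locally abstract types. The sketch below is an illustration under that assumption, not part of the original code: <code>default_var</code> and <code>UnitExt</code> are hypothetical names, and the function only returns an abstract extension type. It does not attempt a function returning <code>A.expr</code>, where <code>expr</code> is a concrete variant in the signature, so the functor-based parser above may well remain the more practical choice.</p>
<pre class="ocaml"><code>module type AST_EXTENSIONS = sig
  type var_ext
  type app_ext
  type abs_ext
  type let_ext

  val default_var_ext : var_ext
  val default_app_ext : app_ext
  val default_abs_ext : abs_ext
  val default_let_ext : let_ext
end

(* Hypothetical helper: an ordinary function that takes a module as an argument.
   The locally abstract type v names the var_ext type of the module, so that it
   can appear in the result type. *)
let default_var (type v) (module E : AST_EXTENSIONS with type var_ext = v) : v =
  E.default_var_ext

(* Hypothetical extension module with no extra data, like the formatter AST. *)
module UnitExt = struct
  type var_ext = unit
  type app_ext = unit
  type abs_ext = unit
  type let_ext = unit

  let default_var_ext = ()
  let default_app_ext = ()
  let default_abs_ext = ()
  let default_let_ext = ()
end

(* The module is packed at the call site and passed like any other value. *)
let () = assert (default_var (module UnitExt) = ())</code></pre>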
<p>Similarly, any other function that’s polymorphic over AST types needs to be a part of a functor that takes an AST module as argument. As another example, here’s a function that counts the number of AST nodes:</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true"></a><span class="kw">module</span> CountNodes (A : AST) = <span class="kw">struct</span></span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true"></a>  <span class="kw">let</span> <span class="kw">rec</span> count (e : A.expr) : <span class="dt">int</span> =</span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true"></a>    <span class="kw">match</span> e <span class="kw">with</span></span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true"></a>    | Var _ -&gt; <span class="dv">1</span></span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true"></a>    | App { fn; arg; _ } -&gt; <span class="dv">1</span> + count fn + count arg</span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true"></a>    | Abs { body; _ } -&gt; <span class="dv">1</span> + count body</span>
<span id="cb8-7"><a href="#cb8-7" aria-hidden="true"></a>    | Let { rhs; body; _ } -&gt; <span class="dv">1</span> + count rhs + count body</span>
<span id="cb8-8"><a href="#cb8-8" aria-hidden="true"></a><span class="kw">end</span></span></code></pre></div>
<p>The final part of the ceremony is to apply these functors to get modules that we can then use to parse, format, and count nodes:</p>
<div class="sourceCode" id="cb9"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true"></a><span class="co">(* Parser module for the formatter. *)</span></span>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true"></a><span class="kw">module</span> FmtParse = Parse (FmtAst)</span>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true"></a></span>
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true"></a><span class="co">(* Parser module for the type checker. *)</span></span>
<span id="cb9-5"><a href="#cb9-5" aria-hidden="true"></a><span class="kw">module</span> TcParse = Parse (TcAst)</span>
<span id="cb9-6"><a href="#cb9-6" aria-hidden="true"></a></span>
<span id="cb9-7"><a href="#cb9-7" aria-hidden="true"></a><span class="co">(* Node counter on the formatter&#39;s AST. *)</span></span>
<span id="cb9-8"><a href="#cb9-8" aria-hidden="true"></a><span class="kw">module</span> CountFmt = CountNodes (FmtAst)</span>
<span id="cb9-9"><a href="#cb9-9" aria-hidden="true"></a></span>
<span id="cb9-10"><a href="#cb9-10" aria-hidden="true"></a><span class="co">(* Node counter on the type checker&#39;s AST. *)</span></span>
<span id="cb9-11"><a href="#cb9-11" aria-hidden="true"></a><span class="kw">module</span> CountTc = CountNodes (TcAst)</span></code></pre></div>
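<p>To see the pieces working together, here is a self-contained sketch that repeats the signatures and functors from this post and adds one hypothetical functor, <code>Build</code>, with smart constructors that fill in the default extension values the way a parser would. <code>Build</code>, <code>FmtBuild</code>, and <code>example</code> are not part of the original code, and only the formatter’s AST is instantiated to keep the sketch short.</p>
<pre class="ocaml"><code>module type AST_EXTENSIONS = sig
  type var_ext
  type app_ext
  type abs_ext
  type let_ext

  val default_var_ext : var_ext
  val default_app_ext : app_ext
  val default_abs_ext : abs_ext
  val default_let_ext : let_ext
end

module type AST = sig
  include AST_EXTENSIONS

  type expr = Var of var | App of app | Abs of abs | Let of let_
  and var = { name : string; var_ext : var_ext }
  and app = { fn : expr; arg : expr; app_ext : app_ext }
  and abs = { param : string; body : expr; abs_ext : abs_ext }
  and let_ = { bound : string; rhs : expr; body : expr; let_ext : let_ext }
end

module MakeAst (Ext : AST_EXTENSIONS) :
  AST
    with type var_ext = Ext.var_ext
     and type app_ext = Ext.app_ext
     and type abs_ext = Ext.abs_ext
     and type let_ext = Ext.let_ext = struct
  type var_ext = Ext.var_ext
  type app_ext = Ext.app_ext
  type abs_ext = Ext.abs_ext
  type let_ext = Ext.let_ext

  let default_var_ext = Ext.default_var_ext
  let default_app_ext = Ext.default_app_ext
  let default_abs_ext = Ext.default_abs_ext
  let default_let_ext = Ext.default_let_ext

  type expr = Var of var | App of app | Abs of abs | Let of let_
  and var = { name : string; var_ext : var_ext }
  and app = { fn : expr; arg : expr; app_ext : app_ext }
  and abs = { param : string; body : expr; abs_ext : abs_ext }
  and let_ = { bound : string; rhs : expr; body : expr; let_ext : let_ext }
end

(* Hypothetical helper: smart constructors that fill in the default extensions. *)
module Build (A : AST) = struct
  open A

  let var name = Var { name; var_ext = default_var_ext }
  let app fn arg = App { fn; arg; app_ext = default_app_ext }
  let abs param body = Abs { param; body; abs_ext = default_abs_ext }
end

module CountNodes (A : AST) = struct
  let rec count (e : A.expr) : int =
    match e with
    | Var _ -&gt; 1
    | App { fn; arg; _ } -&gt; 1 + count fn + count arg
    | Abs { body; _ } -&gt; 1 + count body
    | Let { rhs; body; _ } -&gt; 1 + count rhs + count body
end

module FmtAst = MakeAst (struct
  type var_ext = unit
  type app_ext = unit
  type abs_ext = unit
  type let_ext = unit

  let default_var_ext = ()
  let default_app_ext = ()
  let default_abs_ext = ()
  let default_let_ext = ()
end)

module FmtBuild = Build (FmtAst)
module CountFmt = CountNodes (FmtAst)

(* The term (\x. x) y: one App, one Abs, and two Var nodes. *)
let example : FmtAst.expr = FmtBuild.(app (abs "x" (var "x")) (var "y"))
let () = assert (CountFmt.count example = 4)</code></pre>
<p>Note that none of the signatures in <code>Build</code> or <code>CountNodes</code> mention the extension types: they only refer to <code>A.expr</code>.</p>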
<p>The type checker and formatter then refer to these modules:</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true"></a><span class="kw">let</span> <span class="kw">rec</span> check_expr (e : TcAst.expr) : ty = ...</span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true"></a><span class="kw">let</span> <span class="kw">rec</span> format_expr (e : FmtAst.expr) : <span class="dt">string</span> = ...</span></code></pre></div>
<h1 id="the-good">The good</h1>
<p>I can easily add per-AST functions, constants, or types and my parser or type checker code doesn’t become any worse. They always refer to the AST-specific things directly, and type signatures within the parser and type checker modules don’t get more complicated as we add more extensions.</p>
<h1 id="the-bad">The bad</h1>
<p>All of the AST type definitions need to be duplicated in both the <code>AST</code> signature and the <code>MakeAst</code> functor. This alone renders this feature useless for our purposes, as in any real programming language there will be a lot of AST types, and each type will be quite large too (with many fields and constructors).</p>
<p>There’s also a smaller-scale duplication in these lines:</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true"></a><span class="kw">module</span> MakeAst (Ext : AST_EXTENSIONS) :</span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true"></a>  AST</span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true"></a>    <span class="kw">with</span> <span class="kw">type</span> var_ext = Ext.var_ext</span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true"></a>     <span class="kw">and</span> <span class="kw">type</span> app_ext = Ext.app_ext</span>
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true"></a>     <span class="kw">and</span> <span class="kw">type</span> abs_ext = Ext.abs_ext</span>
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true"></a>     <span class="kw">and</span> <span class="kw">type</span> let_ext = Ext.let_ext = <span class="kw">struct</span></span>
<span id="cb11-7"><a href="#cb11-7" aria-hidden="true"></a>  <span class="kw">type</span> var_ext = Ext.var_ext</span>
<span id="cb11-8"><a href="#cb11-8" aria-hidden="true"></a>  <span class="kw">type</span> app_ext = Ext.app_ext</span>
<span id="cb11-9"><a href="#cb11-9" aria-hidden="true"></a>  <span class="kw">type</span> abs_ext = Ext.abs_ext</span>
<span id="cb11-10"><a href="#cb11-10" aria-hidden="true"></a>  <span class="kw">type</span> let_ext = Ext.let_ext</span>
<span id="cb11-11"><a href="#cb11-11" aria-hidden="true"></a>  ...</span>
<span id="cb11-12"><a href="#cb11-12" aria-hidden="true"></a><span class="kw">end</span></span></code></pre></div>
<p>My understanding is that the types in the <code>struct ... end</code> part are abstract, i.e. not visible outside of the module (similar to existentials), and the <code>: AST with type ...</code> part specifies the returned module signature, i.e. the public interface. They need to be in sync, but they also need to be specified separately.</p>
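<p>A possible way to shorten the <code>struct</code> side of this smaller duplication is <code>include Ext</code>, which brings the extension types (as aliases of <code>Ext</code>’s types) and the default values into the functor body in one line. The sketch below is trimmed to a single extension point and two constructors to stay short, and <code>IntExt</code> and <code>IntAst</code> are hypothetical names. The <code>with type</code> constraints still have to be written out, and the duplicated AST type definitions remain.</p>
<pre class="ocaml"><code>(* Trimmed to a single extension point to keep the sketch short. *)
module type AST_EXTENSIONS = sig
  type var_ext

  val default_var_ext : var_ext
end

module type AST = sig
  include AST_EXTENSIONS

  type expr = Var of var | App of app
  and var = { name : string; var_ext : var_ext }
  and app = { fn : expr; arg : expr }
end

module MakeAst (Ext : AST_EXTENSIONS) : AST with type var_ext = Ext.var_ext =
struct
  (* Replaces the explicit type alias and the default_var_ext rebinding. *)
  include Ext

  type expr = Var of var | App of app
  and var = { name : string; var_ext : var_ext }
  and app = { fn : expr; arg : expr }
end

(* Hypothetical extension: annotate variables with an integer. *)
module IntExt = struct
  type var_ext = int

  let default_var_ext = 0
end

module IntAst = MakeAst (IntExt)

(* The equality var_ext = int is visible outside the functor, so the default
   can be used as an integer here. *)
let shifted : IntAst.expr =
  IntAst.Var { IntAst.name = "x"; var_ext = IntAst.default_var_ext + 1 }</code></pre>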
<p>The only solution I can think of to these duplications is generating code, but if I’m OK with generating code, that opens up a lot of possibilities, and I don’t need functors anymore. I could even generate the full modules with all the AST types and everything else directly, without using functors.</p>
<p>So in short, OCaml modules help quite a bit, but they’re held back by the code duplication issues.</p>
<hr />
<details>
<p><summary>Full code (tested with OCaml 5.3.0)</summary></p>
<div class="sourceCode" id="cb12"><pre class="sourceCode ocaml"><code class="sourceCode ocaml"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true"></a><span class="co">(* Tested with OCaml 5.3.0. *)</span></span>
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true"></a></span>
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true"></a><span class="kw">module</span> <span class="kw">type</span> AST_EXTENSIONS = <span class="kw">sig</span></span>
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true"></a>  <span class="kw">type</span> var_ext</span>
<span id="cb12-5"><a href="#cb12-5" aria-hidden="true"></a>  <span class="kw">type</span> app_ext</span>
<span id="cb12-6"><a href="#cb12-6" aria-hidden="true"></a>  <span class="kw">type</span> abs_ext</span>
<span id="cb12-7"><a href="#cb12-7" aria-hidden="true"></a>  <span class="kw">type</span> let_ext</span>
<span id="cb12-8"><a href="#cb12-8" aria-hidden="true"></a></span>
<span id="cb12-9"><a href="#cb12-9" aria-hidden="true"></a>  <span class="kw">val</span> default_var_ext : var_ext</span>
<span id="cb12-10"><a href="#cb12-10" aria-hidden="true"></a>  <span class="kw">val</span> default_app_ext : app_ext</span>
<span id="cb12-11"><a href="#cb12-11" aria-hidden="true"></a>  <span class="kw">val</span> default_abs_ext : abs_ext</span>
<span id="cb12-12"><a href="#cb12-12" aria-hidden="true"></a>  <span class="kw">val</span> default_let_ext : let_ext</span>
<span id="cb12-13"><a href="#cb12-13" aria-hidden="true"></a><span class="kw">end</span></span>
<span id="cb12-14"><a href="#cb12-14" aria-hidden="true"></a></span>
<span id="cb12-15"><a href="#cb12-15" aria-hidden="true"></a><span class="kw">module</span> <span class="kw">type</span> AST = <span class="kw">sig</span></span>
<span id="cb12-16"><a href="#cb12-16" aria-hidden="true"></a>  <span class="kw">include</span> AST_EXTENSIONS</span>
<span id="cb12-17"><a href="#cb12-17" aria-hidden="true"></a></span>
<span id="cb12-18"><a href="#cb12-18" aria-hidden="true"></a>  <span class="kw">type</span> expr = Var <span class="kw">of</span> var | App <span class="kw">of</span> app | Abs <span class="kw">of</span> <span class="dt">abs</span> | Let <span class="kw">of</span> let_</span>
<span id="cb12-19"><a href="#cb12-19" aria-hidden="true"></a>  <span class="kw">and</span> var = { name : <span class="dt">string</span>; var_ext : var_ext }</span>
<span id="cb12-20"><a href="#cb12-20" aria-hidden="true"></a>  <span class="kw">and</span> app = { fn : expr; arg : expr; app_ext : app_ext }</span>
<span id="cb12-21"><a href="#cb12-21" aria-hidden="true"></a>  <span class="kw">and</span> <span class="dt">abs</span> = { param : <span class="dt">string</span>; body : expr; abs_ext : abs_ext }</span>
<span id="cb12-22"><a href="#cb12-22" aria-hidden="true"></a>  <span class="kw">and</span> let_ = { bound : <span class="dt">string</span>; rhs : expr; body : expr; let_ext : let_ext }</span>
<span id="cb12-23"><a href="#cb12-23" aria-hidden="true"></a><span class="kw">end</span></span>
<span id="cb12-24"><a href="#cb12-24" aria-hidden="true"></a></span>
<span id="cb12-25"><a href="#cb12-25" aria-hidden="true"></a><span class="kw">module</span> MakeAst (Ext : AST_EXTENSIONS) :</span>
<span id="cb12-26"><a href="#cb12-26" aria-hidden="true"></a>  AST</span>
<span id="cb12-27"><a href="#cb12-27" aria-hidden="true"></a>    <span class="kw">with</span> <span class="kw">type</span> var_ext = Ext.var_ext</span>
<span id="cb12-28"><a href="#cb12-28" aria-hidden="true"></a>     <span class="kw">and</span> <span class="kw">type</span> app_ext = Ext.app_ext</span>
<span id="cb12-29"><a href="#cb12-29" aria-hidden="true"></a>     <span class="kw">and</span> <span class="kw">type</span> abs_ext = Ext.abs_ext</span>
<span id="cb12-30"><a href="#cb12-30" aria-hidden="true"></a>     <span class="kw">and</span> <span class="kw">type</span> let_ext = Ext.let_ext = <span class="kw">struct</span></span>
<span id="cb12-31"><a href="#cb12-31" aria-hidden="true"></a>  <span class="kw">type</span> var_ext = Ext.var_ext</span>
<span id="cb12-32"><a href="#cb12-32" aria-hidden="true"></a>  <span class="kw">type</span> app_ext = Ext.app_ext</span>
<span id="cb12-33"><a href="#cb12-33" aria-hidden="true"></a>  <span class="kw">type</span> abs_ext = Ext.abs_ext</span>
<span id="cb12-34"><a href="#cb12-34" aria-hidden="true"></a>  <span class="kw">type</span> let_ext = Ext.let_ext</span>
<span id="cb12-35"><a href="#cb12-35" aria-hidden="true"></a></span>
<span id="cb12-36"><a href="#cb12-36" aria-hidden="true"></a>  <span class="kw">let</span> default_var_ext = Ext.default_var_ext</span>
<span id="cb12-37"><a href="#cb12-37" aria-hidden="true"></a>  <span class="kw">let</span> default_app_ext = Ext.default_app_ext</span>
<span id="cb12-38"><a href="#cb12-38" aria-hidden="true"></a>  <span class="kw">let</span> default_abs_ext = Ext.default_abs_ext</span>
<span id="cb12-39"><a href="#cb12-39" aria-hidden="true"></a>  <span class="kw">let</span> default_let_ext = Ext.default_let_ext</span>
<span id="cb12-40"><a href="#cb12-40" aria-hidden="true"></a></span>
<span id="cb12-41"><a href="#cb12-41" aria-hidden="true"></a>  <span class="kw">type</span> expr = Var <span class="kw">of</span> var | App <span class="kw">of</span> app | Abs <span class="kw">of</span> <span class="dt">abs</span> | Let <span class="kw">of</span> let_</span>
<span id="cb12-42"><a href="#cb12-42" aria-hidden="true"></a>  <span class="kw">and</span> var = { name : <span class="dt">string</span>; var_ext : var_ext }</span>
<span id="cb12-43"><a href="#cb12-43" aria-hidden="true"></a>  <span class="kw">and</span> app = { fn : expr; arg : expr; app_ext : app_ext }</span>
<span id="cb12-44"><a href="#cb12-44" aria-hidden="true"></a>  <span class="kw">and</span> <span class="dt">abs</span> = { param : <span class="dt">string</span>; body : expr; abs_ext : abs_ext }</span>
<span id="cb12-45"><a href="#cb12-45" aria-hidden="true"></a>  <span class="kw">and</span> let_ = { bound : <span class="dt">string</span>; rhs : expr; body : expr; let_ext : let_ext }</span>
<span id="cb12-46"><a href="#cb12-46" aria-hidden="true"></a><span class="kw">end</span></span>
<span id="cb12-47"><a href="#cb12-47" aria-hidden="true"></a></span>
<span id="cb12-48"><a href="#cb12-48" aria-hidden="true"></a><span class="co">(* --------------------------------------------------------</span></span>
<span id="cb12-49"><a href="#cb12-49" aria-hidden="true"></a><span class="co">   A simple recursive-descent parser, generic over any AST.</span></span>
<span id="cb12-50"><a href="#cb12-50" aria-hidden="true"></a></span>
<span id="cb12-51"><a href="#cb12-51" aria-hidden="true"></a><span class="co">   Grammar:</span></span>
<span id="cb12-52"><a href="#cb12-52" aria-hidden="true"></a><span class="co">     expr   ::= &#39;let&#39; IDENT &#39;=&#39; expr &#39;in&#39; expr</span></span>
<span id="cb12-53"><a href="#cb12-53" aria-hidden="true"></a><span class="co">              | &#39;\&#39; IDENT &#39;.&#39; expr</span></span>
<span id="cb12-54"><a href="#cb12-54" aria-hidden="true"></a><span class="co">              | app</span></span>
<span id="cb12-55"><a href="#cb12-55" aria-hidden="true"></a><span class="co">     app    ::= atom+</span></span>
<span id="cb12-56"><a href="#cb12-56" aria-hidden="true"></a><span class="co">     atom   ::= IDENT | &#39;(&#39; expr &#39;)&#39;</span></span>
<span id="cb12-57"><a href="#cb12-57" aria-hidden="true"></a><span class="co">   -------------------------------------------------------- *)</span></span>
<span id="cb12-58"><a href="#cb12-58" aria-hidden="true"></a><span class="kw">module</span> Parse (A : AST) = <span class="kw">struct</span></span>
<span id="cb12-59"><a href="#cb12-59" aria-hidden="true"></a>  <span class="kw">type</span> tokens = <span class="dt">string</span> <span class="dt">list</span></span>
<span id="cb12-60"><a href="#cb12-60" aria-hidden="true"></a></span>
<span id="cb12-61"><a href="#cb12-61" aria-hidden="true"></a>  <span class="co">(* parse_expr: top-level, handles let/lambda/application.</span></span>
<span id="cb12-62"><a href="#cb12-62" aria-hidden="true"></a><span class="co">     Lambda and let bodies extend as far right as possible</span></span>
<span id="cb12-63"><a href="#cb12-63" aria-hidden="true"></a><span class="co">     (i.e. parse_expr), so nested constructs work without parens:</span></span>
<span id="cb12-64"><a href="#cb12-64" aria-hidden="true"></a><span class="co">       let f = \x. \y. x in ...</span></span>
<span id="cb12-65"><a href="#cb12-65" aria-hidden="true"></a><span class="co">       \x. \y. x y</span></span>
<span id="cb12-66"><a href="#cb12-66" aria-hidden="true"></a><span class="co">     parse_app_args stops at &#39;in&#39;, &#39;)&#39;, and non-atom tokens,</span></span>
<span id="cb12-67"><a href="#cb12-67" aria-hidden="true"></a><span class="co">     so &#39;in&#39; correctly terminates a let-RHS that is an application. *)</span></span>
<span id="cb12-68"><a href="#cb12-68" aria-hidden="true"></a>  <span class="kw">let</span> <span class="kw">rec</span> parse_expr (toks : tokens) : A.expr * tokens =</span>
<span id="cb12-69"><a href="#cb12-69" aria-hidden="true"></a>    <span class="kw">match</span> toks <span class="kw">with</span></span>
<span id="cb12-70"><a href="#cb12-70" aria-hidden="true"></a>    | <span class="st">&quot;let&quot;</span> :: name :: <span class="st">&quot;=&quot;</span> :: rest -&gt; (</span>
<span id="cb12-71"><a href="#cb12-71" aria-hidden="true"></a>        <span class="kw">let</span> rhs, rest = parse_expr rest <span class="kw">in</span></span>
<span id="cb12-72"><a href="#cb12-72" aria-hidden="true"></a>        <span class="kw">match</span> rest <span class="kw">with</span></span>
<span id="cb12-73"><a href="#cb12-73" aria-hidden="true"></a>        | <span class="st">&quot;in&quot;</span> :: rest -&gt;</span>
<span id="cb12-74"><a href="#cb12-74" aria-hidden="true"></a>            <span class="kw">let</span> body, rest = parse_expr rest <span class="kw">in</span></span>
<span id="cb12-75"><a href="#cb12-75" aria-hidden="true"></a>            ( A.Let { bound = name; rhs; body; let_ext = A.default_let_ext },</span>
<span id="cb12-76"><a href="#cb12-76" aria-hidden="true"></a>              rest )</span>
<span id="cb12-77"><a href="#cb12-77" aria-hidden="true"></a>        | _ -&gt; <span class="dt">failwith</span> <span class="st">&quot;expected &#39;in&#39;&quot;</span>)</span>
<span id="cb12-78"><a href="#cb12-78" aria-hidden="true"></a>    | <span class="st">&quot;</span><span class="ch">\\</span><span class="st">&quot;</span> :: param :: <span class="st">&quot;.&quot;</span> :: rest -&gt;</span>
<span id="cb12-79"><a href="#cb12-79" aria-hidden="true"></a>        <span class="kw">let</span> body, rest = parse_expr rest <span class="kw">in</span></span>
<span id="cb12-80"><a href="#cb12-80" aria-hidden="true"></a>        (A.Abs { param; body; abs_ext = A.default_abs_ext }, rest)</span>
<span id="cb12-81"><a href="#cb12-81" aria-hidden="true"></a>    | _ -&gt; parse_app toks</span>
<span id="cb12-82"><a href="#cb12-82" aria-hidden="true"></a></span>
<span id="cb12-83"><a href="#cb12-83" aria-hidden="true"></a>  <span class="kw">and</span> parse_app (toks : tokens) : A.expr * tokens =</span>
<span id="cb12-84"><a href="#cb12-84" aria-hidden="true"></a>    <span class="kw">let</span> head, rest = parse_atom toks <span class="kw">in</span></span>
<span id="cb12-85"><a href="#cb12-85" aria-hidden="true"></a>    parse_app_args head rest</span>
<span id="cb12-86"><a href="#cb12-86" aria-hidden="true"></a></span>
<span id="cb12-87"><a href="#cb12-87" aria-hidden="true"></a>  <span class="kw">and</span> parse_app_args (fn : A.expr) (toks : tokens) : A.expr * tokens =</span>
<span id="cb12-88"><a href="#cb12-88" aria-hidden="true"></a>    <span class="kw">match</span> toks <span class="kw">with</span></span>
<span id="cb12-89"><a href="#cb12-89" aria-hidden="true"></a>    | [] | <span class="st">&quot;)&quot;</span> :: _ | <span class="st">&quot;in&quot;</span> :: _ -&gt; (fn, toks)</span>
<span id="cb12-90"><a href="#cb12-90" aria-hidden="true"></a>    | _ -&gt; (</span>
<span id="cb12-91"><a href="#cb12-91" aria-hidden="true"></a>        <span class="kw">match</span> parse_atom_opt toks <span class="kw">with</span></span>
<span id="cb12-92"><a href="#cb12-92" aria-hidden="true"></a>        | <span class="dt">Some</span> (arg, rest) -&gt;</span>
<span id="cb12-93"><a href="#cb12-93" aria-hidden="true"></a>            <span class="kw">let</span> node = A.App { fn; arg; app_ext = A.default_app_ext } <span class="kw">in</span></span>
<span id="cb12-94"><a href="#cb12-94" aria-hidden="true"></a>            parse_app_args node rest</span>
<span id="cb12-95"><a href="#cb12-95" aria-hidden="true"></a>        | <span class="dt">None</span> -&gt; (fn, toks))</span>
<span id="cb12-96"><a href="#cb12-96" aria-hidden="true"></a></span>
<span id="cb12-97"><a href="#cb12-97" aria-hidden="true"></a>  <span class="kw">and</span> parse_atom (toks : tokens) : A.expr * tokens =</span>
<span id="cb12-98"><a href="#cb12-98" aria-hidden="true"></a>    <span class="kw">match</span> parse_atom_opt toks <span class="kw">with</span></span>
<span id="cb12-99"><a href="#cb12-99" aria-hidden="true"></a>    | <span class="dt">Some</span> r -&gt; r</span>
<span id="cb12-100"><a href="#cb12-100" aria-hidden="true"></a>    | <span class="dt">None</span> -&gt;</span>
<span id="cb12-101"><a href="#cb12-101" aria-hidden="true"></a>        <span class="kw">let</span> tok = <span class="kw">match</span> toks <span class="kw">with</span> t :: _ -&gt; t | [] -&gt; <span class="st">&quot;EOF&quot;</span> <span class="kw">in</span></span>
<span id="cb12-102"><a href="#cb12-102" aria-hidden="true"></a>        <span class="dt">failwith</span> (<span class="dt">Printf</span>.sprintf <span class="st">&quot;expected atom, got &#39;%s&#39;&quot;</span> tok)</span>
<span id="cb12-103"><a href="#cb12-103" aria-hidden="true"></a></span>
<span id="cb12-104"><a href="#cb12-104" aria-hidden="true"></a>  <span class="kw">and</span> parse_atom_opt (toks : tokens) : (A.expr * tokens) <span class="dt">option</span> =</span>
<span id="cb12-105"><a href="#cb12-105" aria-hidden="true"></a>    <span class="kw">match</span> toks <span class="kw">with</span></span>
<span id="cb12-106"><a href="#cb12-106" aria-hidden="true"></a>    | <span class="st">&quot;(&quot;</span> :: rest -&gt; (</span>
<span id="cb12-107"><a href="#cb12-107" aria-hidden="true"></a>        <span class="kw">let</span> e, rest = parse_expr rest <span class="kw">in</span></span>
<span id="cb12-108"><a href="#cb12-108" aria-hidden="true"></a>        <span class="kw">match</span> rest <span class="kw">with</span></span>
<span id="cb12-109"><a href="#cb12-109" aria-hidden="true"></a>        | <span class="st">&quot;)&quot;</span> :: rest -&gt; <span class="dt">Some</span> (e, rest)</span>
<span id="cb12-110"><a href="#cb12-110" aria-hidden="true"></a>        | _ -&gt; <span class="dt">failwith</span> <span class="st">&quot;expected &#39;)&#39;&quot;</span>)</span>
<span id="cb12-111"><a href="#cb12-111" aria-hidden="true"></a>    | tok :: rest</span>
<span id="cb12-112"><a href="#cb12-112" aria-hidden="true"></a>      <span class="kw">when</span> tok &lt;&gt; <span class="st">&quot;let&quot;</span> &amp;&amp; tok &lt;&gt; <span class="st">&quot;</span><span class="ch">\\</span><span class="st">&quot;</span> &amp;&amp; tok &lt;&gt; <span class="st">&quot;in&quot;</span> &amp;&amp; tok &lt;&gt; <span class="st">&quot;=&quot;</span></span>
<span id="cb12-113"><a href="#cb12-113" aria-hidden="true"></a>           &amp;&amp; tok &lt;&gt; <span class="st">&quot;.&quot;</span> &amp;&amp; tok &lt;&gt; <span class="st">&quot;(&quot;</span> &amp;&amp; tok &lt;&gt; <span class="st">&quot;)&quot;</span> -&gt;</span>
<span id="cb12-114"><a href="#cb12-114" aria-hidden="true"></a>        <span class="dt">Some</span> (A.Var { name = tok; var_ext = A.default_var_ext }, rest)</span>
<span id="cb12-115"><a href="#cb12-115" aria-hidden="true"></a>    | _ -&gt; <span class="dt">None</span></span>
<span id="cb12-116"><a href="#cb12-116" aria-hidden="true"></a></span>
<span id="cb12-117"><a href="#cb12-117" aria-hidden="true"></a>  <span class="kw">let</span> parse (<span class="dt">input</span> : <span class="dt">string</span>) : A.expr =</span>
<span id="cb12-118"><a href="#cb12-118" aria-hidden="true"></a>    <span class="co">(* Tokenize: pad parens, dots, and backslashes with spaces, then split on spaces *)</span></span>
<span id="cb12-119"><a href="#cb12-119" aria-hidden="true"></a>    <span class="kw">let</span> buf = <span class="dt">Buffer</span>.create (<span class="dt">String</span>.length <span class="dt">input</span>) <span class="kw">in</span></span>
<span id="cb12-120"><a href="#cb12-120" aria-hidden="true"></a>    <span class="dt">String</span>.iter</span>
<span id="cb12-121"><a href="#cb12-121" aria-hidden="true"></a>      (<span class="kw">fun</span> c -&gt;</span>
<span id="cb12-122"><a href="#cb12-122" aria-hidden="true"></a>        <span class="kw">match</span> c <span class="kw">with</span></span>
<span id="cb12-123"><a href="#cb12-123" aria-hidden="true"></a>        | <span class="ch">&#39;(&#39;</span> | <span class="ch">&#39;)&#39;</span> | <span class="ch">&#39;.&#39;</span> | <span class="ch">&#39;\\&#39;</span> -&gt;</span>
<span id="cb12-124"><a href="#cb12-124" aria-hidden="true"></a>            <span class="dt">Buffer</span>.add_char buf <span class="ch">&#39; &#39;</span>;</span>
<span id="cb12-125"><a href="#cb12-125" aria-hidden="true"></a>            <span class="dt">Buffer</span>.add_char buf c;</span>
<span id="cb12-126"><a href="#cb12-126" aria-hidden="true"></a>            <span class="dt">Buffer</span>.add_char buf <span class="ch">&#39; &#39;</span></span>
<span id="cb12-127"><a href="#cb12-127" aria-hidden="true"></a>        | _ -&gt; <span class="dt">Buffer</span>.add_char buf c)</span>
<span id="cb12-128"><a href="#cb12-128" aria-hidden="true"></a>      <span class="dt">input</span>;</span>
<span id="cb12-129"><a href="#cb12-129" aria-hidden="true"></a>    <span class="kw">let</span> s = <span class="dt">Buffer</span>.contents buf <span class="kw">in</span></span>
<span id="cb12-130"><a href="#cb12-130" aria-hidden="true"></a>    <span class="kw">let</span> toks = <span class="dt">String</span>.split_on_char <span class="ch">&#39; &#39;</span> s |&gt; <span class="dt">List</span>.filter (<span class="kw">fun</span> s -&gt; s &lt;&gt; <span class="st">&quot;&quot;</span>) <span class="kw">in</span></span>
<span id="cb12-131"><a href="#cb12-131" aria-hidden="true"></a>    <span class="kw">let</span> expr, rest = parse_expr toks <span class="kw">in</span></span>
<span id="cb12-132"><a href="#cb12-132" aria-hidden="true"></a>    <span class="kw">if</span> rest &lt;&gt; [] <span class="kw">then</span></span>
<span id="cb12-133"><a href="#cb12-133" aria-hidden="true"></a>      <span class="dt">failwith</span> (<span class="dt">Printf</span>.sprintf <span class="st">&quot;unexpected token &#39;%s&#39;&quot;</span> (<span class="dt">List</span>.hd rest));</span>
<span id="cb12-134"><a href="#cb12-134" aria-hidden="true"></a>    expr</span>
<span id="cb12-135"><a href="#cb12-135" aria-hidden="true"></a><span class="kw">end</span></span>
<span id="cb12-136"><a href="#cb12-136" aria-hidden="true"></a></span>
<span id="cb12-137"><a href="#cb12-137" aria-hidden="true"></a><span class="co">(* --------------------------------------------------------</span></span>
<span id="cb12-138"><a href="#cb12-138" aria-hidden="true"></a><span class="co">   Formatter — all extensions are unit.</span></span>
<span id="cb12-139"><a href="#cb12-139" aria-hidden="true"></a><span class="co">   -------------------------------------------------------- *)</span></span>
<span id="cb12-140"><a href="#cb12-140" aria-hidden="true"></a><span class="kw">module</span> FmtAst = MakeAst (<span class="kw">struct</span></span>
<span id="cb12-141"><a href="#cb12-141" aria-hidden="true"></a>  <span class="kw">type</span> var_ext = <span class="dt">unit</span></span>
<span id="cb12-142"><a href="#cb12-142" aria-hidden="true"></a>  <span class="kw">type</span> app_ext = <span class="dt">unit</span></span>
<span id="cb12-143"><a href="#cb12-143" aria-hidden="true"></a>  <span class="kw">type</span> abs_ext = <span class="dt">unit</span></span>
<span id="cb12-144"><a href="#cb12-144" aria-hidden="true"></a>  <span class="kw">type</span> let_ext = <span class="dt">unit</span></span>
<span id="cb12-145"><a href="#cb12-145" aria-hidden="true"></a></span>
<span id="cb12-146"><a href="#cb12-146" aria-hidden="true"></a>  <span class="kw">let</span> default_var_ext = ()</span>
<span id="cb12-147"><a href="#cb12-147" aria-hidden="true"></a>  <span class="kw">let</span> default_app_ext = ()</span>
<span id="cb12-148"><a href="#cb12-148" aria-hidden="true"></a>  <span class="kw">let</span> default_abs_ext = ()</span>
<span id="cb12-149"><a href="#cb12-149" aria-hidden="true"></a>  <span class="kw">let</span> default_let_ext = ()</span>
<span id="cb12-150"><a href="#cb12-150" aria-hidden="true"></a><span class="kw">end</span>)</span>
<span id="cb12-151"><a href="#cb12-151" aria-hidden="true"></a></span>
<span id="cb12-152"><a href="#cb12-152" aria-hidden="true"></a><span class="kw">module</span> FmtParse = Parse (FmtAst)</span>
<span id="cb12-153"><a href="#cb12-153" aria-hidden="true"></a></span>
<span id="cb12-154"><a href="#cb12-154" aria-hidden="true"></a><span class="kw">let</span> <span class="kw">rec</span> format_expr (e : FmtAst.expr) : <span class="dt">string</span> =</span>
<span id="cb12-155"><a href="#cb12-155" aria-hidden="true"></a>  <span class="kw">match</span> e <span class="kw">with</span></span>
<span id="cb12-156"><a href="#cb12-156" aria-hidden="true"></a>  | Var { name; _ } -&gt; name</span>
<span id="cb12-157"><a href="#cb12-157" aria-hidden="true"></a>  | App { fn; arg; _ } -&gt;</span>
<span id="cb12-158"><a href="#cb12-158" aria-hidden="true"></a>      <span class="dt">Printf</span>.sprintf <span class="st">&quot;(%s %s)&quot;</span> (format_expr fn) (format_arg arg)</span>
<span id="cb12-159"><a href="#cb12-159" aria-hidden="true"></a>  | Abs { param; body; _ } -&gt;</span>
<span id="cb12-160"><a href="#cb12-160" aria-hidden="true"></a>      <span class="dt">Printf</span>.sprintf <span class="st">&quot;(</span><span class="ch">\\</span><span class="st">%s. %s)&quot;</span> param (format_expr body)</span>
<span id="cb12-161"><a href="#cb12-161" aria-hidden="true"></a>  | Let { bound; rhs; body; _ } -&gt;</span>
<span id="cb12-162"><a href="#cb12-162" aria-hidden="true"></a>      <span class="dt">Printf</span>.sprintf <span class="st">&quot;(let %s = %s in %s)&quot;</span> bound (format_expr rhs)</span>
<span id="cb12-163"><a href="#cb12-163" aria-hidden="true"></a>        (format_expr body)</span>
<span id="cb12-164"><a href="#cb12-164" aria-hidden="true"></a></span>
<span id="cb12-165"><a href="#cb12-165" aria-hidden="true"></a><span class="kw">and</span> format_arg (e : FmtAst.expr) : <span class="dt">string</span> =</span>
<span id="cb12-166"><a href="#cb12-166" aria-hidden="true"></a>  <span class="kw">match</span> e <span class="kw">with</span></span>
<span id="cb12-167"><a href="#cb12-167" aria-hidden="true"></a>  | Var { name; _ } -&gt; name</span>
<span id="cb12-168"><a href="#cb12-168" aria-hidden="true"></a>  | _ -&gt; format_expr e <span class="co">(* format_expr already parenthesizes compound forms *)</span></span>
<span id="cb12-169"><a href="#cb12-169" aria-hidden="true"></a></span>
<span id="cb12-170"><a href="#cb12-170" aria-hidden="true"></a><span class="co">(* --------------------------------------------------------</span></span>
<span id="cb12-171"><a href="#cb12-171" aria-hidden="true"></a><span class="co">   Type checker — extensions carry inferred types.</span></span>
<span id="cb12-172"><a href="#cb12-172" aria-hidden="true"></a><span class="co">   -------------------------------------------------------- *)</span></span>
<span id="cb12-173"><a href="#cb12-173" aria-hidden="true"></a><span class="kw">type</span> ty = TyVar <span class="kw">of</span> <span class="dt">string</span> | TyArrow <span class="kw">of</span> ty * ty</span>
<span id="cb12-174"><a href="#cb12-174" aria-hidden="true"></a><span class="kw">type</span> tc_var_ext = { inferred_type : ty <span class="dt">option</span> }</span>
<span id="cb12-175"><a href="#cb12-175" aria-hidden="true"></a><span class="kw">type</span> tc_app_ext = { result_type : ty <span class="dt">option</span> }</span>
<span id="cb12-176"><a href="#cb12-176" aria-hidden="true"></a><span class="kw">type</span> tc_abs_ext = { param_type : ty <span class="dt">option</span> }</span>
<span id="cb12-177"><a href="#cb12-177" aria-hidden="true"></a><span class="kw">type</span> tc_let_ext = { bound_type : ty <span class="dt">option</span> }</span>
<span id="cb12-178"><a href="#cb12-178" aria-hidden="true"></a></span>
<span id="cb12-179"><a href="#cb12-179" aria-hidden="true"></a><span class="kw">module</span> TcAst = MakeAst (<span class="kw">struct</span></span>
<span id="cb12-180"><a href="#cb12-180" aria-hidden="true"></a>  <span class="kw">type</span> var_ext = tc_var_ext</span>
<span id="cb12-181"><a href="#cb12-181" aria-hidden="true"></a>  <span class="kw">type</span> app_ext = tc_app_ext</span>
<span id="cb12-182"><a href="#cb12-182" aria-hidden="true"></a>  <span class="kw">type</span> abs_ext = tc_abs_ext</span>
<span id="cb12-183"><a href="#cb12-183" aria-hidden="true"></a>  <span class="kw">type</span> let_ext = tc_let_ext</span>
<span id="cb12-184"><a href="#cb12-184" aria-hidden="true"></a></span>
<span id="cb12-185"><a href="#cb12-185" aria-hidden="true"></a>  <span class="kw">let</span> default_var_ext = { inferred_type = <span class="dt">None</span> }</span>
<span id="cb12-186"><a href="#cb12-186" aria-hidden="true"></a>  <span class="kw">let</span> default_app_ext = { result_type = <span class="dt">None</span> }</span>
<span id="cb12-187"><a href="#cb12-187" aria-hidden="true"></a>  <span class="kw">let</span> default_abs_ext = { param_type = <span class="dt">None</span> }</span>
<span id="cb12-188"><a href="#cb12-188" aria-hidden="true"></a>  <span class="kw">let</span> default_let_ext = { bound_type = <span class="dt">None</span> }</span>
<span id="cb12-189"><a href="#cb12-189" aria-hidden="true"></a><span class="kw">end</span>)</span>
<span id="cb12-190"><a href="#cb12-190" aria-hidden="true"></a></span>
<span id="cb12-191"><a href="#cb12-191" aria-hidden="true"></a><span class="kw">module</span> TcParse = Parse (TcAst)</span>
<span id="cb12-192"><a href="#cb12-192" aria-hidden="true"></a></span>
<span id="cb12-193"><a href="#cb12-193" aria-hidden="true"></a><span class="kw">let</span> <span class="kw">rec</span> format_ty (t : ty) : <span class="dt">string</span> =</span>
<span id="cb12-194"><a href="#cb12-194" aria-hidden="true"></a>  <span class="kw">match</span> t <span class="kw">with</span></span>
<span id="cb12-195"><a href="#cb12-195" aria-hidden="true"></a>  | TyVar s -&gt; s</span>
<span id="cb12-196"><a href="#cb12-196" aria-hidden="true"></a>  | TyArrow ((TyArrow _ <span class="kw">as</span> a), b) -&gt;</span>
<span id="cb12-197"><a href="#cb12-197" aria-hidden="true"></a>      <span class="dt">Printf</span>.sprintf <span class="st">&quot;(%s) -&gt; %s&quot;</span> (format_ty a) (format_ty b)</span>
<span id="cb12-198"><a href="#cb12-198" aria-hidden="true"></a>  | TyArrow (a, b) -&gt; <span class="dt">Printf</span>.sprintf <span class="st">&quot;%s -&gt; %s&quot;</span> (format_ty a) (format_ty b)</span>
<span id="cb12-199"><a href="#cb12-199" aria-hidden="true"></a></span>
<span id="cb12-200"><a href="#cb12-200" aria-hidden="true"></a><span class="co">(* Placeholder, not real inference: use the annotation if present, else guess. *)</span></span>
<span id="cb12-201"><a href="#cb12-201" aria-hidden="true"></a><span class="kw">let</span> <span class="kw">rec</span> check_expr (e : TcAst.expr) : ty =</span>
<span id="cb12-202"><a href="#cb12-202" aria-hidden="true"></a>  <span class="kw">match</span> e <span class="kw">with</span></span>
<span id="cb12-203"><a href="#cb12-203" aria-hidden="true"></a>  | Var { var_ext = { inferred_type = <span class="dt">Some</span> t }; _ } -&gt; t</span>
<span id="cb12-204"><a href="#cb12-204" aria-hidden="true"></a>  | Var { name; _ } -&gt; TyVar name</span>
<span id="cb12-205"><a href="#cb12-205" aria-hidden="true"></a>  | App { app_ext = { result_type = <span class="dt">Some</span> t }; _ } -&gt; t</span>
<span id="cb12-206"><a href="#cb12-206" aria-hidden="true"></a>  | App { fn; _ } -&gt; (</span>
<span id="cb12-207"><a href="#cb12-207" aria-hidden="true"></a>      <span class="kw">match</span> check_expr fn <span class="kw">with</span> TyArrow (_, ret) -&gt; ret | t -&gt; t)</span>
<span id="cb12-208"><a href="#cb12-208" aria-hidden="true"></a>  | Abs { param; body; abs_ext = { param_type }; _ } -&gt;</span>
<span id="cb12-209"><a href="#cb12-209" aria-hidden="true"></a>      <span class="kw">let</span> p = <span class="kw">match</span> param_type <span class="kw">with</span> <span class="dt">Some</span> t -&gt; t | <span class="dt">None</span> -&gt; TyVar param <span class="kw">in</span></span>
<span id="cb12-210"><a href="#cb12-210" aria-hidden="true"></a>      TyArrow (p, check_expr body)</span>
<span id="cb12-211"><a href="#cb12-211" aria-hidden="true"></a>  | Let { body; _ } -&gt; check_expr body</span>
<span id="cb12-212"><a href="#cb12-212" aria-hidden="true"></a></span>
<span id="cb12-213"><a href="#cb12-213" aria-hidden="true"></a><span class="co">(* --------------------------------------------------------</span></span>
<span id="cb12-214"><a href="#cb12-214" aria-hidden="true"></a><span class="co">   Generic node counter — works on any AST.</span></span>
<span id="cb12-215"><a href="#cb12-215" aria-hidden="true"></a><span class="co">   -------------------------------------------------------- *)</span></span>
<span id="cb12-216"><a href="#cb12-216" aria-hidden="true"></a><span class="kw">module</span> CountNodes (A : AST) = <span class="kw">struct</span></span>
<span id="cb12-217"><a href="#cb12-217" aria-hidden="true"></a>  <span class="kw">let</span> <span class="kw">rec</span> count (e : A.expr) : <span class="dt">int</span> =</span>
<span id="cb12-218"><a href="#cb12-218" aria-hidden="true"></a>    <span class="kw">match</span> e <span class="kw">with</span></span>
<span id="cb12-219"><a href="#cb12-219" aria-hidden="true"></a>    | Var _ -&gt; <span class="dv">1</span></span>
<span id="cb12-220"><a href="#cb12-220" aria-hidden="true"></a>    | App { fn; arg; _ } -&gt; <span class="dv">1</span> + count fn + count arg</span>
<span id="cb12-221"><a href="#cb12-221" aria-hidden="true"></a>    | Abs { body; _ } -&gt; <span class="dv">1</span> + count body</span>
<span id="cb12-222"><a href="#cb12-222" aria-hidden="true"></a>    | Let { rhs; body; _ } -&gt; <span class="dv">1</span> + count rhs + count body</span>
<span id="cb12-223"><a href="#cb12-223" aria-hidden="true"></a><span class="kw">end</span></span>
<span id="cb12-224"><a href="#cb12-224" aria-hidden="true"></a></span>
<span id="cb12-225"><a href="#cb12-225" aria-hidden="true"></a><span class="kw">module</span> CountFmt = CountNodes (FmtAst)</span>
<span id="cb12-226"><a href="#cb12-226" aria-hidden="true"></a><span class="kw">module</span> CountTc = CountNodes (TcAst)</span>
<span id="cb12-227"><a href="#cb12-227" aria-hidden="true"></a></span>
<span id="cb12-228"><a href="#cb12-228" aria-hidden="true"></a><span class="co">(* --------------------------------------------------------</span></span>
<span id="cb12-229"><a href="#cb12-229" aria-hidden="true"></a><span class="co">   Demo: parse the same source in both worlds.</span></span>
<span id="cb12-230"><a href="#cb12-230" aria-hidden="true"></a><span class="co">   -------------------------------------------------------- *)</span></span>
<span id="cb12-231"><a href="#cb12-231" aria-hidden="true"></a><span class="kw">let</span> source = {|<span class="kw">let</span> id = \x. x <span class="kw">in</span> id <span class="dv">42</span>|}</span>
<span id="cb12-232"><a href="#cb12-232" aria-hidden="true"></a></span>
<span id="cb12-233"><a href="#cb12-233" aria-hidden="true"></a><span class="kw">let</span> () =</span>
<span id="cb12-234"><a href="#cb12-234" aria-hidden="true"></a>  <span class="co">(* Formatter world — parse and pretty-print *)</span></span>
<span id="cb12-235"><a href="#cb12-235" aria-hidden="true"></a>  <span class="kw">let</span> prog = FmtParse.parse source <span class="kw">in</span></span>
<span id="cb12-236"><a href="#cb12-236" aria-hidden="true"></a>  <span class="dt">Printf</span>.printf <span class="st">&quot;formatted: %s</span><span class="ch">\n</span><span class="st">&quot;</span> (format_expr prog);</span>
<span id="cb12-237"><a href="#cb12-237" aria-hidden="true"></a>  <span class="dt">Printf</span>.printf <span class="st">&quot;node count: %d</span><span class="ch">\n</span><span class="st">&quot;</span> (CountFmt.count prog);</span>
<span id="cb12-238"><a href="#cb12-238" aria-hidden="true"></a></span>
<span id="cb12-239"><a href="#cb12-239" aria-hidden="true"></a>  <span class="co">(* Type checker world — parse (extensions default to None),</span></span>
<span id="cb12-240"><a href="#cb12-240" aria-hidden="true"></a><span class="co">     then check with the placeholder checker *)</span></span>
<span id="cb12-241"><a href="#cb12-241" aria-hidden="true"></a>  <span class="kw">let</span> tc_prog = TcParse.parse source <span class="kw">in</span></span>
<span id="cb12-242"><a href="#cb12-242" aria-hidden="true"></a>  <span class="dt">Printf</span>.printf <span class="st">&quot;inferred type: %s</span><span class="ch">\n</span><span class="st">&quot;</span> (format_ty (check_expr tc_prog));</span>
<span id="cb12-243"><a href="#cb12-243" aria-hidden="true"></a>  <span class="dt">Printf</span>.printf <span class="st">&quot;node count: %d</span><span class="ch">\n</span><span class="st">&quot;</span> (CountTc.count tc_prog)</span></code></pre></div>
</details>
<section class="footnotes" role="doc-endnotes">
<hr />
<ol>
<li id="fn1" role="doc-endnote"><p>Actually, any kind of polymorphism. Fir currently doesn’t have trait objects, and the only way to get polymorphism is through type parameters, potentially with qualifications.<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn2" role="doc-endnote"><p>This is slightly simplified; see <a href="https://osa1.net/posts/2025-01-18-fir-error-handling.html">this post</a> for more details and examples.<a href="#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>]]></summary>
</entry>
<entry>
    <title>Extensible named types in Fir</title>
    <link href="http://osa1.net/posts/2026-03-07-extensible-named-types-fir.html" />
    <id>http://osa1.net/posts/2026-03-07-extensible-named-types-fir.html</id>
    <published>2026-03-07T00:00:00Z</published>
    <updated>2026-03-07T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>The front-end AST types are one of the most important types in a language implementation, and if we get them wrong nothing will be right in the rest of the implementation.</p>
<p>These types should be cheap to allocate and efficient to use, but also extensible, as different tools will use them differently. A type checker may want to add inferred types to expressions, but for a formatter, those inferred type fields would be a waste of memory.</p>
<p>One approach to this problem is to have a parser that generates parse events instead of an AST or CST, and let the tools have their own ASTs. I explored this in <a href="https://osa1.net/posts/2024-11-22-how-to-parse-1.html">a previous blog post</a>.</p>
<p>This approach works fine when the language is small, but for a programming language that’s never the case. Fir is currently quite simple, yet it has 28 types of expressions. Most production languages have many more.</p>
<p>So I’ve been thinking about making Fir’s AST types extensible with new fields in the self-hosted compiler. This AST will be used by many of the tools listed <a href="https://github.com/fir-lang/fir/issues/28">here</a>, and more. The parser and the AST types will be published as libraries.</p>
<p>There are a few common ways to add new fields to an existing type:</p>
<ul>
<li><p>With subtyping of nominal types (common in OOP languages), we can create a subtype with extra fields.</p></li>
<li><p>In languages where objects have identities (again, common in OOP languages), we can use an identity map to map objects to extra information.</p></li>
<li><p>If the objects don’t have identities, we can manually generate unique identities for objects that we want to attach extra information to, and then use a map, like in the previous option.</p></li>
</ul>
<p>The third option can be done in Fir, and it has a few advantages compared to extending existing types: <a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a></p>
<ul>
<li><p>The maps we use to attach extra information to AST nodes can be deallocated separately from the AST types. So if we have a long computation where we need some information in some of the steps but not later, we can allocate the maps and then deallocate them while keeping the AST nodes alive.</p></li>
<li><p>We can create differently typed identities for different AST types, and generate the identities as consecutive numbers. Then use arrays instead of hash maps to map nodes to things.</p></li>
<li><p>Unlike built-in identities, we can choose the identity size (e.g. 32-bit numbers instead of 64-bit), and embed information about the values in the identities.</p></li>
</ul>
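<p>As a rough illustration of the points above, here is a minimal Python sketch (not Fir, and not the self-hosted compiler’s code). The names <code>ExprId</code>, <code>ExprIdGen</code>, and <code>ExprTable</code> are hypothetical. It assumes identities are handed out as consecutive integers, so a side table can be a plain array, and the table can be dropped while the nodes stay alive.</p>

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ExprId:
    """Hypothetical identity for expression nodes: a consecutive integer."""
    index: int


class ExprIdGen:
    """Hands out consecutive ids, so side tables can be plain arrays."""

    def __init__(self) -> None:
        self.next_index = 0

    def fresh(self) -> ExprId:
        ident = ExprId(self.next_index)
        self.next_index += 1
        return ident


class ExprTable(Generic[T]):
    """Array-backed map from ExprId to extra information of type T."""

    def __init__(self) -> None:
        self.slots: list[Optional[T]] = []

    def set(self, ident: ExprId, value: T) -> None:
        # Ids are dense, so growing the array on demand wastes little space.
        missing = ident.index + 1 - len(self.slots)
        self.slots.extend([None] * max(0, missing))
        self.slots[ident.index] = value

    def get(self, ident: ExprId) -> Optional[T]:
        return self.slots[ident.index] if len(self.slots) > ident.index else None


gen = ExprIdGen()
lhs, rhs = gen.fresh(), gen.fresh()

# The type checker keeps its results outside the AST nodes.
inferred_types = ExprTable()
inferred_types.set(rhs, "U32")
```

<p>A separate id type per AST node type (say, one for statements) would keep the tables from being mixed up. Dropping <code>inferred_types</code> frees the extra information without touching the nodes.</p>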
<p>I think this is probably the way to go in Fir’s self-hosted compiler, at least in the short term.</p>
<p>However, while thinking about this, I also found another way to extend types with more information: row types.</p>
<h1 id="row-types-in-fir-today">Row types in Fir today</h1>
<p>Row types are mainly used for variants, which are the types that make exception handling in Fir <a href="https://osa1.net/posts/2025-01-18-fir-error-handling.html">safe, expressive, and convenient to use</a>.</p>
<p>A variant is just a set of types, e.g. <code>[U32, Str]</code> is a variant type with <code>U32</code> (32-bit unsigned integer) and <code>Str</code> (immutable, UTF-8 encoded unicode strings). Values of this type can be <code>U32</code>s or <code>Str</code>s.</p>
<p>The type <code>[U32, Str, ..r]</code> is the same as before, but it can have more types in it. When pattern matching a value of this type, we have to have a catch-all case handling the <code>..r</code> part, which represents extra types that the value may have.</p>
<p>To construct a variant value we just add a <code>~</code> prefix, e.g. <code>~123</code> gets the type <code>[U32, ..r]</code> (with a fresh <code>r</code>).</p>
<p>A crucial feature of variants in Fir is that they allow type refinement when pattern matching. If I have a variant value with type <code>[Bool, Str, ..r]</code>, and handle the <code>Bool</code>s in a pattern match and bind the rest to a variable, the variable gets a refined type:</p>
<pre><code>handleBools(arg: [Bool, Str, ..r]) [Str, ..r]:
    match arg:
        ~Bool.True: ~&quot;True&quot;
        ~Bool.False: ~&quot;False&quot;
        other: other</code></pre>
<p>Here the type of <code>other</code> is refined as <code>[Str, ..r]</code>, because the previous alternatives of the <code>match</code> handle the <code>Bool</code> values, so at this point we know that the value can’t be a <code>Bool</code>. <a href="#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a></p>
<p>When variants are used as checked exceptions, this allows things like: catching some of the exceptions thrown by a function and propagating the rest. See the link at the beginning of this section for more examples.</p>
<p>Now, these row types that represent “extra stuff” can also be used in records, and Fir supports that too. For example, the function below can take any record that has at least <code>x: U32</code> and <code>y: U32</code> fields:</p>
<pre><code>printXY(record: (x: U32, y: U32, ..r)):
    print(&quot;x = `record.x`, y = `record.y`&quot;)

main():
    printXY((x = 1, y = 2))

    # Extra fields are OK:
    printXY((x = 3, y = 4, msg = &quot;hi&quot;))</code></pre>
<p>But I think this feature of records is not that useful. In Fir, records are also value types<a href="#fn3" class="footnote-ref" id="fnref3" role="doc-noteref"><sup>3</sup></a>, and the main use case for records is returning multiple values. And when returning multiple values that “extra fields” part of the record types is not useful. This is because we can’t return a record with the extension part (<code>..r</code>), unless that record is passed as an argument. Consider:</p>
<pre><code>returnExtensibleRecord() (x: U32, y: U32, ..r):
    ???</code></pre>
<p>There’s no non-divergent expression in the body that will make this type check.</p>
<p>This is different from variants, where a variant construction like <code>~"Hi"</code> will have type <code>[Str, ..r]</code> (with fresh <code>r</code>). So we can have this:</p>
<pre><code>returnVariant() [Str, ..r]:
    ~&quot;Hi&quot;</code></pre>
<p>In other words, rows in variants allow us to assume that a value may have some extra values, and there are many use cases where we want to do that (again, see the blog post linked at the beginning of this section).</p>
<p>Rows in records are for ignoring extra fields, which is not that useful if we assume that the main use case is to return more than one value from functions.</p>
<p>The reason why I implemented row extensions in records is that, once I had the type checker and monomorphiser that can deal with rows, it was straightforward to apply it to records as well.</p>
<p>It also allowed me to experiment with extensible types a bit more, which led to…</p>
<h1 id="a-new-use-case-for-rows">A new use case for rows?</h1>
<p>We can use the variant rows for extending sum types with new constructors, and record rows for extending product types with new fields. Here’s an example that works in Fir today: <a href="#fn4" class="footnote-ref" id="fnref4" role="doc-noteref"><sup>4</sup></a></p>
<pre><code>type Foo[r](
    x: U32,
    y: U32,
    ..r
)

main():
    let foo = Foo(x = 123, y = 456, z = &quot;hi&quot;)
    print(foo.x)
    print(foo.y)
    print(foo.z)</code></pre>
<p><code>Foo</code> is a named type. The <code>r</code> is a record row kinded type parameter, representing extra fields. Type inference infers the type <code>Foo[row(z: Str)]</code> for <code>foo</code>. We can access the field <code>z</code> just like any other field.</p>
<p>(The only difference between a record construction syntax and a named type constructor syntax is the missing name: <code>(x = 123, y = 456)</code> is a record, <code>Foo(x = 123, y = 456)</code> is a named type value.)</p>
<p>This gives us a way to extend product types. For example, in our AST, the expression node for binary operators may look like this:</p>
<pre><code>type BinOpExpr[extras](
    left: Expr,
    right: Expr,
    op: BinOp,
    ..extras
)</code></pre>
<p>The formatter could then use this as <code>BinOpExpr[row()]</code>, and the type checker could add an extra field for the inferred type of the expression with <code>BinOpExpr[row(inferredTy: Ty)]</code>.</p>
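<p>For readers who don’t know Fir, the two instantiations can be approximated in Python. This is only a sketch: Python has no row types, so the extension lives in a generic <code>extras</code> field rather than being spliced into the node, and <code>NoExtras</code> and <code>TcExtras</code> are hypothetical names.</p>

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

Extras = TypeVar("Extras")


@dataclass
class NoExtras:
    """Stands in for the empty row, row()."""


@dataclass
class TcExtras:
    """Stands in for row(inferredTy: Ty); types are plain strings here."""
    inferred_ty: str


@dataclass
class BinOpExpr(Generic[Extras]):
    left: str
    right: str
    op: str
    extras: Extras


# The formatter's node carries nothing extra; the type checker's node does.
fmt_node = BinOpExpr("x", "y", "+", NoExtras())
tc_node = BinOpExpr("x", "y", "+", TcExtras(inferred_ty="U32"))
```

<p>The difference from the Fir version is that the extra field is reached through <code>tc_node.extras.inferred_ty</code>, whereas with rows it would be a direct field of the node.</p>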
<p>The idea applies to sum types the same way; however, it’s currently not fully implemented in my prototype, because of syntax issues. Here’s what row extensions look like with sum types:</p>
<pre><code>type Expr[extras]:
    Var(VarExpr)
    BinOp(BinOpExpr)
    ..extras</code></pre>
<p>Now suppose I want to extend this type with the standard library <code>Bool</code> type:</p>
<pre><code>value type Bool:
    False
    True</code></pre>
<p>How should the extra values be constructed? The way we normally construct sum values is as <code>&lt;type&gt;.&lt;constructor&gt;(&lt;args&gt;)</code>, e.g. <code>Bool.True</code>, <code>Expr.BinOp(...)</code>.</p>
<p>But with a sum type extended with another sum type, I’m not sure what syntax to use for construction. I can see two options:</p>
<ul>
<li><code>Expr.Bool.True</code>: extended type, extension type, then constructor.</li>
<li><code>Expr.True</code>: extended type, then constructor.</li>
</ul>
<p>There’s also the issue of not all types having a constructor name. For example, with this syntax, we wouldn’t have a way of constructing a <code>Str</code> as <code>Expr[row[Str]]</code> as string literals are not constructed with the <code>&lt;constructor&gt;(&lt;args&gt;)</code> syntax.</p>
<p>In short, I couldn’t find a nice syntax for sum types with extensions, so they’re currently not implemented in my prototype.</p>
<h1 id="problems-and-features-needed">Problems and features needed</h1>
<p>This approach adds type parameters to types, and type parameters can be contagious (propagated to the use sites, and their use sites, and theirs…).<a href="#fn5" class="footnote-ref" id="fnref5" role="doc-noteref"><sup>5</sup></a></p>
<p>Consider the statement type in Fir’s AST:</p>
<pre><code>type Stmt:
    Let(LetStmt)
    Assign(AssignStmt)
    Expr(Expr)
    For(ForStmt)
    While(WhileStmt)
    Loop(LoopStmt)
    Break(BreakStmt)
    Continue(ContinueStmt)</code></pre>
<p>To extend this I’ll need one type parameter per extension. If I have to extend <code>let</code> statements and <code>for</code> statements with different fields, I need two:</p>
<pre><code>type Stmt[letExts, forExts]:
    Let(LetStmt[letExts])
    For(ForStmt[forExts])
    ...</code></pre>
<p>It’s clear that this will scale poorly.</p>
<p>To keep the number of type parameters in check we could use something like type families (type-level functions) to have one type per use case (e.g. type checking, formatting), and then map those to different extension types, but I’m not sure if adding type-level functions just to support this feature makes sense.</p>
<p>Another issue is with <code>deriving</code>: we will have some way of deriving trait implementations, similar to Rust<a href="#fn6" class="footnote-ref" id="fnref6" role="doc-noteref"><sup>6</sup></a>. With row extensions, we can’t use a macro with just the item AST as the input, as the macro will just see type parameters for the extensions. We have to iterate the extension fields somehow in the derived code generator, and regardless of how we iterate the row fields, the actual code generation needs to be done during monomorphisation, as that’s when we know the full type arguments.</p>
<p>To properly type check this we also have to extend the constraint language. Consider this:</p>
<pre><code>type Foo[r](
    f1: U32,
    f2: Str,
    ..r
)</code></pre>
<p>Here the constructor <code>Foo</code> will have the type: <code>Fn(f1: U32, f2: Str, ..r) Foo[r]</code><a href="#fn7" class="footnote-ref" id="fnref7" role="doc-noteref"><sup>7</sup></a>, but not all rows will be valid for <code>r</code>: we can’t allow overriding existing fields with different types<a href="#fn8" class="footnote-ref" id="fnref8" role="doc-noteref"><sup>8</sup></a>.</p>
<p>It’s easy to check the example above, but in general, these “lacks” constraints (i.e. “record row type <code>r</code> lacks fields <code>f1</code>, <code>f2</code>”) need to be carried over to the use sites of the type parameter to be able to type check properly. In our <code>type Stmt[letExts, forExts]: ...</code> above, the constraints will be coming from the <code>LetStmt</code> and <code>ForStmt</code> types, not from <code>Stmt</code>, and they need to be carried over to the use sites of <code>Stmt</code>.</p>
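<p>A minimal Python sketch of what carrying “lacks” constraints could look like, under stated assumptions: <code>ProductDecl</code>, <code>check_row</code>, and <code>carried_constraints</code> are hypothetical names, the field lists for <code>LetStmt</code> and <code>ForStmt</code> are made up for illustration, and the check rejects any repeated field name (stricter than allowing duplicates with the same type). It is not how Fir’s type checker is implemented.</p>

```python
from dataclasses import dataclass


@dataclass
class ProductDecl:
    """A product type with one row parameter, e.g. Foo[r](f1, f2, ..r)."""
    name: str
    fields: dict[str, str]  # field name to type name

    def lacks(self) -> set[str]:
        # The row parameter must lack every field the type declares itself.
        return set(self.fields)


def check_row(decl: ProductDecl, row: dict[str, str]) -> list[str]:
    """Return the row fields that violate the declaration's lacks constraint."""
    return sorted(decl.lacks().intersection(row))


def carried_constraints(params: dict[str, ProductDecl]) -> dict[str, set[str]]:
    """Constraints for a type like Stmt[letExts, forExts]: each row parameter
    inherits the lacks constraint of the product type it is passed to."""
    return {param: decl.lacks() for param, decl in params.items()}


foo = ProductDecl("Foo", {"f1": "U32", "f2": "Str"})
ok = check_row(foo, {"z": "Str"})                 # no clash
bad = check_row(foo, {"f1": "Str", "z": "Str"})   # f1 is already declared

# Hypothetical field lists, only to show where the constraints come from.
let_stmt = ProductDecl("LetStmt", {"binder": "Id", "rhs": "Expr"})
for_stmt = ProductDecl("ForStmt", {"iterator": "Expr", "body": "Vec[Stmt]"})
stmt_constraints = carried_constraints({"letExts": let_stmt, "forExts": for_stmt})
```

<p>Here <code>stmt_constraints</code> is what a use site of <code>Stmt</code> would need to see in order to reject a bad row argument during type checking rather than during monomorphisation.</p>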
<p>Currently not having these constraints on the type parameters doesn’t cause soundness issues as the monomorphiser catches these issues, but it’s not ideal because it means that these errors wouldn’t be caught in the language server (which won’t fully compile, just type check), or when running <code>fir --typecheck &lt;file&gt;</code>. Error reporting is also not as good as error reporting in the type checker.</p>
<p>(The lack of “lacks” constraints was not a problem before this feature because variants can always be extended with any type (duplicate types are OK), and it’s not possible to extend records. At least currently, row types in records are only for forgetting/ignoring extra fields.)</p>
<p>Finally, to avoid repeatedly typing the same row type arguments in the use sites in the parser, formatter, etc. we need type synonyms. Fir currently doesn’t have type synonyms because I don’t think they’re that useful when we have value types, and I hate to deal with them in the type checker.<a href="#fn9" class="footnote-ref" id="fnref9" role="doc-noteref"><sup>9</sup></a> In our <code>Stmt</code> example above, we’ll want to write:</p>
<pre><code># Extensions for type checking.
alias TcLetStmtExts = row(inferredBinderType: Option[Ty])
alias TcForStmtExts = row(inferredIteratorType: Option[Ty])
alias TcLetStmt = LetStmt[TcLetStmtExts]
alias TcForStmt = ForStmt[TcForStmtExts]
alias TcStmt = Stmt[TcLetStmtExts, TcForStmtExts]
...

# Extensions for formatting.
alias FmtLetStmtExts = row()
alias FmtForStmtExts = row()
alias FmtLetStmt = LetStmt[FmtLetStmtExts]
alias FmtForStmt = ForStmt[FmtForStmtExts]
alias FmtStmt = Stmt[FmtLetStmtExts, FmtForStmtExts]
...</code></pre>
<p>And then with a feature similar to type families, we can have one type for each use site (type checker, formatter, …) and map that one type to extension types for each of the rows, reducing the number of type parameters. (There will always be at least one type parameter in extended types.)</p>
<h1 id="final-thoughts">Final thoughts</h1>
<p>I’m not aware of any other languages that apply row extensions to named types, which is the reason why I wanted to write this post.</p>
<p>The main challenge for this feature to be useful is the <code>deriving</code> support. The macros will have to run during monomorphisation to make use of the extra fields and constructors. The generated code will then be type checked in a different language (monomorphic AST instead of the front-end AST), which can lead to things like: code that normally doesn’t type check, but does type check when generated in a macro, as macro expansion is type checked differently. While I can’t imagine how this could happen today, that doesn’t mean it won’t, and it’s best if we just don’t open the door to this kind of thing.</p>
<section class="footnotes" role="doc-endnotes">
<hr />
<ol>
<li id="fn1" role="doc-endnote"><p>See also <a href="https://osa1.net/posts/2020-02-21-knot-tying-why-how-opinions.html">my blog post from 2020</a> that touches on some of the same points.<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn2" role="doc-endnote"><p>Variants are value (unboxed) types, so they’re not heap allocated, and refinement just moves fields around. In general, pattern matching should never allocate, and this currently holds in Fir.<a href="#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn3" role="doc-endnote"><p>In short, all anonymous types are values in Fir. For named types the user decides whether to box or not.<a href="#fnref3" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn4" role="doc-endnote"><p>This only works in a prototype that currently lives in the <code>extensible_named_types</code> branch. The online interpreter does not have this feature yet.<a href="#fnref4" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn5" role="doc-endnote"><p>I know I failed to articulate it <a href="https://osa1.net/posts/2024-10-09-oop-good.html">at the time</a>, but I think polymorphism without requiring type parameters is the main advantage of subtyping compared to parametric polymorphism, and I think it’s the killer feature of OOP (as I define in the post). I want to get back to this point in a later blog post.<a href="#fnref5" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn6" role="doc-endnote"><p>We already support <code>#[derive(...)]</code>s today, but they’re a part of the self-hosted compiler (not libraries), and I’m not sure if we want to keep them or do it another way. I needed to derive implementations quickly and didn’t have time to consider alternatives too much.<a href="#fnref6" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn7" role="doc-endnote"><p>Yes, I also had to add row extensions to function types for this.<a href="#fnref7" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn8" role="doc-endnote"><p>I think duplicating fields should be OK.<a href="#fnref8" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn9" role="doc-endnote"><p>We don’t want to eagerly expand type synonyms to their RHSs because then error messages refer to the RHSs rather than synonyms, and keeping type synonyms around as we type check means we have to remember to look through them in many places. It’s a minor thing but considering how useful they are (very little, at least until this feature) it just seemed like they’re not worth it.<a href="#fnref9" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>]]></summary>
</entry>
<entry>
    <title>How Fir formats comments</title>
    <link href="http://osa1.net/posts/2025-09-27-fir-formatter.html" />
    <id>http://osa1.net/posts/2025-09-27-fir-formatter.html</id>
    <published>2025-09-27T00:00:00Z</published>
    <updated>2025-09-27T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p><a href="https://github.com/fir-lang/fir">Fir</a> formats comments by assigning comment tokens to non-comment tokens (only conceptually, not in the implementation, see below), and generating comments when formatting the tokens that “own” them.</p>
<p>This keeps AST nodes small. The parser doesn’t know about comments at all, and code that doesn’t care about comments doesn’t allocate more or run more code for comments.</p>
<hr />
<p>Formatting source code with comments is tricky, and common suggestions like adding comments to AST nodes or generating lossless (or concrete) syntax trees (CSTs) are not feasible in real programming languages. Consider this simple Fir function:</p>
<pre><code>add(x: U32, y: U32) U32:
    ...</code></pre>
<p>This simple function, without the body, has 14 places where a comment can appear:</p>
<pre><code>#|1|#
#|2|# add #|3|# (
    #|4|# x #|5|# : #|6|# U32 #|7|# ,
    #|8|# y #|9|# : #|10|# U32 #|11|#
) #|12|# U32 #|13|# : #|14|#
    ...</code></pre>
<p>If I were to add comment tokens to AST nodes, about 6 of these would belong to the “function declaration” AST node:</p>
<pre><code>#|1|#
#|2|# add #|3|# ( #|4|# ... ) #|12|# ... : #|14|#
    ...</code></pre>
<p>Because these are all in different positions in the declaration, they would need different fields in the AST node.</p>
<p>If you consider that a real programming language will have hundreds of different types of expression, statement, declaration, … nodes, it becomes clear that this approach is simply not feasible.</p>
<p>The CST approach is not too different, it just moves the inconvenience from the tree type definitions and tree allocations to the use sites of the trees.</p>
<p>What Fir does is much simpler: it requires no support from the parse trees. The parser doesn’t even know about comments, and the AST users that don’t care about comments also don’t need to deal with them and don’t pay any price for them (runtime or memory).</p>
<p>Conceptually, we assign every comment token to a non-comment token. In the example above, comments 1, 2, and 3 belong to the identifier <code>add</code>. Comment 4 belongs to the token <code>(</code>, and so on.</p>
<p>When formatting, we don’t generate text directly. Instead we format the source code token by token. In the example above, we’re formatting a function definition, so we know that there will be a left paren after the function name. But we don’t generate a “(” directly after the function name. Instead we find the token for the left paren, and format it. This formatting operation also generates comments that belong to the left paren.</p>
<p><strong>Assigning comment tokens to non-comment tokens:</strong> Conceptually, every token owns:</p>
<ul>
<li><p>Comment tokens before them that are not on the same line with another non-comment token.</p></li>
<li><p>Comment tokens after them that are on the same line with the token.</p></li>
</ul>
<p>In the example above, 1 and 2 belong to the identifier <code>add</code> because of the first rule, and 3 also belongs to the identifier because of the second rule.</p>
<p>This only leaves the trailing comments at the end of a file “unowned”, which we handle separately as their own thing.</p>
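<p>The two ownership rules can be sketched in a few lines of Python. This is an illustration under assumptions, not the formatter’s code (which never materializes this mapping): the <code>Token</code> type and <code>comment_owners</code> are hypothetical, and every token is assumed to sit on a single line.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    text: str
    line: int
    is_comment: bool = False


def comment_owners(tokens: list[Token]) -> dict[int, Optional[int]]:
    """Map each comment token index to the index of its owning token.

    None marks trailing comments at the end of the file, which have no owner.
    """
    owners: dict[int, Optional[int]] = {}
    prev: Optional[int] = None  # nearest non-comment token seen so far
    pending: list[int] = []     # comments waiting for the next non-comment token
    for i, tok in enumerate(tokens):
        if not tok.is_comment:
            for c in pending:
                owners[c] = i
            pending.clear()
            prev = i
        elif prev is not None and tokens[prev].line == tok.line:
            owners[i] = prev    # second rule: comment trails its owner's line
        else:
            pending.append(i)   # first rule: owned by the next non-comment token
    for c in pending:
        owners[c] = None
    return owners


tokens = [
    Token("#|1|#", 1, True),
    Token("#|2|#", 2, True),
    Token("add", 2),
    Token("#|3|#", 2, True),
    Token("(", 2),
    Token("#|4|#", 2, True),
    Token("# trailing", 3, True),
]
owners = comment_owners(tokens)
```

<p>With this input, comments 1, 2, and 3 map to <code>add</code> (index 2), comment 4 maps to <code>(</code> (index 4), and the final comment has no owner.</p>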
<p><strong>Finding tokens of AST nodes:</strong> The formatter still operates on AST nodes and AST nodes typically don’t need any extra fields for their tokens.</p>
<p>Instead of adding tokens to AST nodes, we represent identifiers as their tokens. Because many AST nodes have identifiers, we can start with those tokens and scan backwards and forwards to find the other tokens of the AST node, with the comments that they own.</p>
<p>When an AST node doesn’t have any identifiers, or finding the tokens of the node from the identifiers is difficult, we add a field for its first (or last) token, and scan forwards (or backwards) from those tokens to find the other tokens.</p>
<p>For example, in Fir, as of today, type declarations are represented like this: (<a href="https://github.com/fir-lang/fir/blob/7732446fe42185778cf331350345b114087b01b9/Compiler/Ast.fir#L66-L83">source</a>)</p>
<pre><code>## A type declaration: `type Vec[t]: ...`.
type TypeDecl(
    ## When the type is a primitive, the `prim` token.
    prim_: Option[TokenIdx],

    ## The type name. `Vec` in the example.
    name: Id,

    ## Type parameters of the type. `[t]` in the example.
    typeParams: Vec[Id],

    ## Kinds of `type_params`. Filled in by kind inference.
    typeParamKinds: Vec[Kind],

    ## Constructors of the type.
    rhs: Option[TypeDeclRhs],
)</code></pre>
<p>Note that this node doesn’t have a token for the <code>type</code> keyword. Instead we start from <code>name</code> and scan backwards. The first non-trivia token that we see will be the <code>type</code> token. (<a href="https://github.com/fir-lang/fir/blob/7732446fe42185778cf331350345b114087b01b9/Tool/Format/Format.fir#L105-L106">source</a>)</p>
<p>(The <code>prim_</code> field could also be removed and we could scan backwards from the <code>type</code> token. If you’re interested in contributing, we have <a href="https://github.com/fir-lang/fir/issues/206">an issue</a> about cleaning up redundant token fields in AST nodes, which would be a good issue for getting started.)</p>
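<p>Scanning from a known token to a neighbouring one can be sketched in Python as below. The token representation (a list of kind and text pairs) and the set of trivia kinds are assumptions for illustration; the real <code>prevNonTrivia</code> and <code>nextNonTrivia</code> in the Fir repo may differ.</p>

```python
from typing import Optional

# Assumed trivia kinds: anything the parser would skip over.
TRIVIA_KINDS = {"comment", "whitespace", "newline"}


def prev_non_trivia(tokens: list[tuple[str, str]], idx: int) -> Optional[int]:
    """Index of the closest non-trivia token before idx, or None."""
    for i in range(idx - 1, -1, -1):
        if tokens[i][0] not in TRIVIA_KINDS:
            return i
    return None


def next_non_trivia(tokens: list[tuple[str, str]], idx: int) -> Optional[int]:
    """Index of the closest non-trivia token after idx, or None."""
    for i in range(idx + 1, len(tokens)):
        if tokens[i][0] not in TRIVIA_KINDS:
            return i
    return None


# Tokens for: a doc comment, then `type #| c |# Vec[`.
tokens = [
    ("comment", "## A vector."),
    ("newline", "\n"),
    ("keyword", "type"),
    ("whitespace", " "),
    ("comment", "#| c |#"),
    ("whitespace", " "),
    ("ident", "Vec"),
    ("lbracket", "["),
]
name_idx = 6
type_idx = prev_non_trivia(tokens, name_idx)
```

<p>Starting from the name token, the backwards scan skips the comment and whitespace and lands on the <code>type</code> keyword, so the AST node doesn’t need a field for it.</p>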
<p><strong>Generating comments with tokens:</strong> I used the word “conceptually” a few times above, because in the implementation we don’t really assign comment tokens to non-comment tokens.</p>
<p>Instead, the function that formats a token scans backwards and forwards to find comment tokens as described by the rules above, and generates them with the token.</p>
<hr />
<p>Scanning backwards and forwards to find other tokens, and collecting comment tokens that belong to a token being formatted, is quite simple. Here is the relevant code:</p>
<ul>
<li><p><a href="https://github.com/fir-lang/fir/blob/7732446fe42185778cf331350345b114087b01b9/Tool/Format/Format.fir#L1520-L1583"><code>formatToken</code></a> takes a non-comment token to be formatted and formats the token with the comments that belong to the token.</p></li>
<li><p><code>formatToken</code> calls <a href="https://github.com/fir-lang/fir/blob/7732446fe42185778cf331350345b114087b01b9/Tool/Format/Format.fir#L1703-L1723"><code>findCommentBefore</code></a> to find the first comment before it that needs to be formatted with it.</p>
<p>Finding the comments after it is easier, so it’s done in <code>formatToken</code> directly.</p></li>
<li><p><a href="https://github.com/fir-lang/fir/blob/7732446fe42185778cf331350345b114087b01b9/Tool/Format/Format.fir#L1761-L1773"><code>nextNonTrivia</code></a> and <a href="https://github.com/fir-lang/fir/blob/7732446fe42185778cf331350345b114087b01b9/Tool/Format/Format.fir#L1776-L1788"><code>prevNonTrivia</code></a> scan forwards and backwards from a given token to find the tokens of an AST node, as mentioned in the type declaration example above.</p></li>
<li><p>The trailing comments at the end of the file are not owned by any token, so they’re not formatted by default. Instead they’re <a href="https://github.com/fir-lang/fir/blob/7732446fe42185778cf331350345b114087b01b9/Tool/Format/Format.fir#L71-L81">handled specially</a> by the module formatter.</p></li>
</ul>
<p>Not adding tokens to the AST nodes keeps the AST nodes small (cheaper to allocate), and parser and user code simple. Use sites that don’t care about comment nodes pay no price for larger AST nodes or extra parsing code handling comments.</p>
<p>(There are a few open issues about Fir’s formatter, but none that are caused by the ideas explained in this post.)</p>]]></summary>
</entry>
<entry>
    <title>Fir is getting useful</title>
    <link href="http://osa1.net/posts/2025-09-04-fir-getting-useful.html" />
    <id>http://osa1.net/posts/2025-09-04-fir-getting-useful.html</id>
    <published>2025-09-04T00:00:00Z</published>
    <updated>2025-09-04T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>A few months ago I implemented a <a href="https://github.com/fir-lang/fir/blob/55bf6bacf31d04f5a6b623aedbede5a02bcd31a8/tools/peg/Peg.fir">PEG parser generator</a> in Fir. It <a href="https://github.com/fir-lang/fir/blob/55bf6bacf31d04f5a6b623aedbede5a02bcd31a8/tools/peg/PegGrammar.peg">parses its own grammar</a> and it’s also used to <a href="https://github.com/fir-lang/fir/blob/55bf6bacf31d04f5a6b623aedbede5a02bcd31a8/compiler/Grammar.peg">parse Fir</a>.</p>
<p>This week I finished another sizable<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a> Fir project: a <a href="https://github.com/fir-lang/fir/blob/55bf6bacf31d04f5a6b623aedbede5a02bcd31a8/tools/format/Format.fir">code formatter for Fir</a>. It now <a href="https://github.com/fir-lang/fir/commit/222940029c1cc71da2cd35d4f3c90eab885c918e">formats most of the Fir code</a> in the repo<a href="#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a>.</p>
<p>Fir is being designed and implemented from day one with tooling, libraries, and backwards compatibility in mind. The compiler’s front-end is currently being reused by the formatter. Soon it’ll be reused by a syntax-aware search-and-replace tool (similar to <a href="https://github.com/osa1/sg">sg</a>), and by a tool that combines Fir packages into a single .fir file (for sharing repros and automated repro reduction), and much later, by the language server and other tools. You can see the list of tools I want to implement <a href="https://github.com/fir-lang/fir/issues/28">here</a>.</p>
<p>By implementing the tooling along with the first version of the compiler (all in Fir), I want to make sure we have the right SDK design to support all these tools, and more. I want to publish the Fir front-end as a reusable package. This front-end should support the last N<a href="#fn3" class="footnote-ref" id="fnref3" role="doc-noteref"><sup>3</sup></a> releases of Fir, so that you can parse (and analyze, modify, refactor, migrate, …) the last N versions of Fir with the latest version of Fir.</p>
<p>I still haven’t written a post explaining what kind of language I want Fir to be, because that’s still largely an open question. However there are a few things that are decided: a compiled, typed language with ADTs, with typeclasses (called traits) for compile-time polymorphism (monomorphised, with value types), and <a href="https://osa1.net/posts/2025-06-28-why-effects.html">effects</a>. I want Fir to be a high-level, but still efficient, language.</p>
<p>Even implementing just a compiler is a big task, and designing and implementing a whole language with all these tools can’t be done by one person. If this vision sounds interesting to you, and you clicked on a few links above and like what you see, please don’t hesitate to reach out. Each of these tools comes with its own issues and tasks, so now is a good time to start contributing to Fir. I already have a list of <a href="https://github.com/fir-lang/fir/issues?q=is%3Aissue%20state%3Aopen%20label%3Apeg">issues for the PEG generator</a> and the <a href="https://github.com/fir-lang/fir/issues?q=is%3Aissue%20state%3Aopen%20label%3Aformatter">formatter</a>. There are also all kinds of other things in the issue tracker. Depending on your experience, you can also keep yourself entertained in other ways: the interpreter is slow (a simple AST walker), the interpreter’s type checker is not in good shape, etc. If you have the experience and opinions, you can also influence the language design.</p>
<p>Next, I’ll be implementing the search-and-replace tool mentioned above (I’m doing this now mainly because I need it when working on Fir) and, in parallel, <a href="https://github.com/fir-lang/fir/issues/195">designing and implementing the module system</a>. The module system will need to be implemented in the interpreter too, because I’ll be using modules in the compiler and other tools. Depending on how much free time I’ll have, it should be at least a month of work.</p>
<p>I’m happy with how it’s coming along and I’m excited about Fir’s future.</p>
<section class="footnotes" role="doc-endnotes">
<hr />
<ol>
<li id="fn1" role="doc-endnote"><p>Formatter is currently 1,086 loc. PEG is 850 loc without the parser for parsing itself. Generated Fir for the parsing PEGs is 2,364 loc, generated from 178 loc PEG.</p>
<p>It’s a bit more difficult to precisely measure the compiler’s grammar size, because it includes semantic actions, but the grammar is 888 loc and generated parser for the grammar is 5,147 loc.</p>
<p>In total (including tests), we have 21,012 loc Fir today in the repo.</p>
<p>All numbers excluding comments and whitespace.<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn2" role="doc-endnote"><p>We don’t format tests to avoid accidentally parsing only formatted code.<a href="#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn3" role="doc-endnote"><p>I’m not sure what the exact number here should be yet.<a href="#fnref3" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>]]></summary>
</entry>
<entry>
    <title>Why I'm excited about effect systems</title>
    <link href="http://osa1.net/posts/2025-06-28-why-effects.html" />
    <id>http://osa1.net/posts/2025-06-28-why-effects.html</id>
    <published>2025-06-28T00:00:00Z</published>
    <updated>2025-06-28T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>Imagine a programming language where you can have full control over whether and how functions, modules, or libraries interact with shared resources like the scheduler for threading, the file system and other OS-level resources like sockets and other file descriptors, timers for things like delaying the current thread for timed updates or scheduling timed callbacks, and so on.</p>
<p>In this language, a function (or module, library, …) needs to declare its interactions with the shared resources in its type.</p>
<p>When a function accesses e.g. the file system, the caller has full control over how it accesses the file system. All file system access functions can be specified (or overridden if they have a default) by the caller.</p>
<p>Furthermore, assume that this language can also suspend functions and resume them later, similar to <code>async</code> functions in many languages today, which are paused and resumed later when the value of e.g. a <code>Future</code> becomes available.</p>
<p>This language lends itself to more composable systems than anything that we have today. Such a system is composable, flexible, and testable by default.</p>
<p>If you think about it, it’s really strange that today we find it acceptable that I can import a library, and the library can spawn threads, use the file system, block the current thread with things like <code>sleep</code> or with blocking IO operations, and I have no control over it.</p>
<p>Most of the time, this kind of thing will be at least documented, but if I use a library that fundamentally needs these things, unless the library accounts for my use case, I may not be able to use it in my application.</p>
<p>For example, maybe it spawns threads but I want it to use my own thread pool where, in addition to limiting the number of threads, I attach priorities to threads and schedule based on priorities.</p>
<p>Or, maybe I have a library that builds/compiles things by reading files, processing them, and generating files. If I have control over the file system API that the library uses, it takes no effort (e.g. no planning ahead of time) to test this library using an in-memory file system, in parallel, without worrying about races and IO bottlenecks. I don’t have to consider testing scenarios in the library and structure my code accordingly.</p>
<p>Or, maybe I have code that polls some resources, and maybe posts periodic updates. It creates a thread that does the periodic work, and <code>sleep</code>s. With control over threads, schedulers, and timers, I can fast-forward in time (to the next event) in my tests without actually waiting for <code>sleep</code>s and any other timed events, to test my code quickly.</p>
<p>These are some of the things I get to do with an effect system.</p>
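<p>To make the last scenario concrete, here is a minimal sketch in Python (not in any effect-typed language), with a generator standing in for a function that can be suspended at each effect. The names <code>poller</code> and <code>run_with_virtual_clock</code>, and the 60-second interval, are made up for this illustration.</p>
<pre><code>def poller(rounds):
    # Hypothetical periodic task: post an update, then sleep for 60 seconds.
    for i in range(rounds):
        yield ("post", "update " + str(i))
        yield ("sleep", 60)


def run_with_virtual_clock(task):
    # Test handler: "sleep" only advances a counter, so nothing actually waits.
    now = 0
    log = []
    for name, arg in task:
        if name == "sleep":
            now += arg
        elif name == "post":
            log.append((now, arg))
    return log</code></pre>
<p>The same <code>poller</code> could be driven by another loop that really sleeps and really posts; the task itself would not change.</p>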
<h2 id="whats-in-an-effect-system">What’s in an effect system?</h2>
<p>At a high level, an effect system has two components: (1) a type system, and (2) runtime features.</p>
<p>These two components are somewhat orthogonal: you can have one without the other, depending on what you want to make possible.</p>
<p>In the systems available today, (1) typically involves adding a type component to function types, for the effects a function can invoke.<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a></p>
<p>For example, in <a href="https://koka-lang.github.io/">Koka</a>, if you define stdin/stdout operations in an effect named <code>console</code>, and have a function that uses the <code>console</code> effects, the function’s type signature looks like this:</p>
<pre><code>fun sayHi(): console ()
  print(&quot;hi&quot;)</code></pre>
<p>This type says <code>sayHi</code> returns unit (<code>()</code>) and uses the <code>console</code> effect.</p>
<p>(2) typically involves capturing the continuation of the effect invocation and passing it to a “handler”. Depending on the system, the handler can then do things (e.g. memory operations, invoking other effects) and “jump” to (or “tail call”) the continuation with the value returned by the invoked effect.</p>
<p>With the <code>console</code> effect above, a handler may just record the printed string in a data structure, which can then be used for testing. Another handler may actually write to <code>stdout</code>, which would then be used when you run the application.</p>
<p>Depending on the exact (1) and (2) features, you get to do different things. The current effect systems in various languages support different (1) and (2) features, and there are some systems that omit one of (1) or (2) entirely.</p>
<p>For the purposes of this blog post, we won’t consider the full spectrum of features you can have, and what those features allow.</p>
<h2 id="example-a-simple-grep-implementation-in-koka">Example: a simple grep implementation in Koka</h2>
<p>There isn’t a language today that gives us everything we need for the use cases I described at the beginning.</p>
<p>However, among the languages that we have, Koka comes close, so we’ll use Koka for a simple example.</p>
<p>Imagine a simple “grep” command that takes a string and a list of file paths as arguments, and finds occurrences of the string in the file contents and reports them.</p>
<p>In Koka, the standard library definitions for these “effects” could look like this:</p>
<pre><code>effect fs
  ctl read-file(path: path): string

effect console
  ctl println(s: string): ()</code></pre>
<p>Using these effects, the code that reads the files and searches for the string is no different from how it would look in any other “functional”<a href="#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a> language:</p>
<pre><code>fun search(pattern: string, files: list&lt;string&gt;): &lt;fs, console&gt;()
  val pattern-size = pattern.count()
  files.foreach fn(file)
    val contents = read-file(file.path)
    val parts = contents.split(pattern)
    report-matches(file, pattern-size, parts)

fun report-matches(file: string, pattern-size: int, parts: list&lt;string&gt;): &lt;console&gt;()
  if parts.length == 0 then
    return ()

  println(file)

  var line := 0
  var column := 0
  parts.init.foreach fn(part)
    part.vector.foreach fn(char)
      if char == &#39;\n&#39; then
        line := line + 1
        column := 0
      else
        column := column + 1

    println((line + 1).show ++ &quot;:&quot; ++ (column + 1).show)</code></pre>
<p>When calling <code>search</code>, I have to provide handlers for <code>fs</code> and <code>console</code> effects.</p>
<p>In the executable that I generate for users, I can use handlers that do actual file system operations and print to <code>stdout</code>:</p>
<pre><code>val fs-io = handler
  ctl read-file(path: path)
    resume(read-text-file(path))

val console-terminal = handler
  ctl println(s: string)
    write-to-stdout(s)
    resume(())</code></pre>
<p>In the tests, I can use a <code>read-file</code> handler that reads from an in-memory map, and a <code>println</code> handler that adds printed lines to a list, to compare with the expected test outputs:</p>
<pre><code>struct test-case
  files: list&lt;test-file&gt;
  pattern: string
  expected-output: list&lt;string&gt;

struct test-file
  path: path
  contents: string

val test-cases: list&lt;test-case&gt; = [
  Test-case(
    files = [Test-file(&quot;file1&quot;.path, &quot;test\ntest&quot;), Test-file(&quot;file2&quot;.path, &quot;a\n test\nb&quot;)],
    pattern = &quot;test&quot;,
    expected-output = [&quot;file1&quot;, &quot;1:1&quot;, &quot;2:1&quot;, &quot;file2&quot;, &quot;2:2&quot;]
  ),
]

fun test(): &lt;exn&gt;()
  var printed-lines := Nil

  test-cases.foreach fn (test)
    with handler
      ctl read-file(path_: path)
        match test.files.find(fn (file) file.path.string == path_.string)
          Just(file) -&gt; resume(file.contents)
          Nothing -&gt; throw(&quot;file not found&quot;, ExnAssert)

    with handler
      ctl println(s: string)
        printed-lines := Cons(s, printed-lines)
        resume(())

    search(test.pattern, test.files.map(fn (file) file.path.string))

    if printed-lines.reverse != test.expected-output then
      throw(&quot;unexpected test output&quot;, ExnAssert)</code></pre>
<p>You can see the full example <a href="https://gist.github.com/osa1/a5e7fdfa30d69125970c0797c525ede2">here</a>.</p>
<h2 id="i-can-already-do-this-in-language-x-using-libraryframework-y">I can already do this in language X using library/framework Y?</h2>
<p>The point of effect systems is that you don’t get a composable and testable system only <em>when you design for it</em>; you get it <em>by default</em>.</p>
<p>If you implement a library that uses the file system, I can run it with an in-memory file system, or intercept file accesses to prevent certain things, or log certain things, and so on, regardless of whether you designed for it or not.</p>
<p>The Koka code above does not demonstrate this fully, and there’s no system available today that can. I’m just using whatever is available today.</p>
<p>In an ideal system, you would have to go out of your way to have access to the filesystem without using an effect, rather than the other way around.</p>
<p>When comparing languages we never talk about what’s possible: almost everything is possible in almost every general purpose programming language.</p>
<p>What we’re talking about is things like: the idiomatic and performant way of doing things.</p>
<p>The language where what I talk about is idiomatic and performant does not exist today.</p>
<h2 id="how-do-we-know-that-this-ideal-system-is-possible">How do we know that this ideal system is possible?</h2>
<p>We mentioned that the two components of an effect system are somewhat orthogonal. In the design that I have in mind (more on this below), without the type system part of it you still get 90% of the benefits. So let’s focus on the runtime parts.</p>
<p>What you need for a flexible effect system is, <em>conceptually</em>, a way of suspending the stack when calling an effect, and passing the suspended stack (you may want to call it a “continuation”) to the handler for the effect invoked.</p>
<p>This kind of thing is already possible in many of the high-level languages today. If your language supports lightweight threads (green threads, fibers, etc.), coroutines, generators, or similar features where the code is suspended when it does something like <code>await</code> or <code>yield</code>, and then resumed later, you already have the runtime features for a flexible effect system.</p>
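<p>As a rough illustration, the following Python sketch uses a generator as the suspended computation: every <code>yield</code> hands an effect to a handler, and <code>send</code> resumes the computation with the handler’s answer. The tuple encoding of effects and the names <code>run</code> and <code>run_in_memory</code> are assumptions of this sketch, and <code>search</code> is a much simplified version of the Koka function above that only prints the names of matching files.</p>
<pre><code>def search(pattern, paths):
    # Each yield suspends the function and passes an effect to the handler.
    for path in paths:
        contents = yield ("read-file", path)
        if pattern in contents:
            yield ("println", path)


def run(computation, handlers):
    # Drive the generator: look up the handler for each yielded effect and
    # resume the computation with the value the handler returns.
    try:
        effect = next(computation)
        while True:
            name, arg = effect
            effect = computation.send(handlers[name](arg))
    except StopIteration:
        return None


def run_in_memory(pattern, files):
    # Test handlers: read from a dict and collect printed lines in a list.
    printed = []
    handlers = {"read-file": files.__getitem__, "println": printed.append}
    run(search(pattern, list(files)), handlers)
    return printed</code></pre>
<p>Unlike a real effect system, this sketch has a single flat handler table, and a generator can only be resumed once per suspension. It is meant to show the suspend-and-resume mechanism, not a complete design.</p>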
<h2 id="for-me-its-about-composable-and-testable-libraries">For me, it’s about composable and testable libraries</h2>
<p>I deliberately didn’t mention in this blog post so far that effect systems generalize features like async/await, iterators/generators, exceptions, and many other features.</p>
<p>The reason is that, as a user, I don’t care whether these features are implemented using an effect system under the hood, or in some other ways. For example, Dart has all of these features, but it doesn’t use an effect system to implement them. As a user, it doesn’t matter to me as long as I have the features.</p>
<p>Instead, what I’m more interested in as a user is: how it influences or affects library design, and what it allows me to do at a high level, in large code bases.</p>
<p>However, it would be a shame to not mention that, yes, effect systems generalize all these features, and more. The paper <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/asynceffects-msr-tr-2017-21.pdf">“Structured Asynchrony with Algebraic Effects”</a> shows how these features can be implemented in Koka.</p>
<h2 id="to-be-continued">To be continued</h2>
<p>Some of the recent discussions online about effect systems left me somewhat dissatisfied, because most posts seem to focus on small-scale benefits of effect systems, and I wanted to share my incomplete (but hopefully not incoherent!) perspective on effect systems.</p>
<p>In future posts I’m hoping to cover some of the open problems when designing such a system.</p>
<hr />
<p>Thanks to <a href="https://github.com/TimWhiting/">Tim Whiting</a> for reviewing a draft of this blog post.</p>
<section class="footnotes" role="doc-endnotes">
<hr />
<ol>
<li id="fn1" role="doc-endnote"><p>This is a somewhat rough approximation of what these effect types in function types indicate. In practice it’s more complicated than “effects the function invokes”: if you read it that way, you fail to explain some of the type errors, or why some of the code type checks. More on this (hopefully) in a future post.<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn2" role="doc-endnote"><p>“Functional” in quotes because I don’t think that word means much these days. Maybe more on this later.<a href="#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>]]></summary>
</entry>
<entry>
    <title>Changes to variants in Fir</title>
    <link href="http://osa1.net/posts/2025-06-12-fir-new-variants.html" />
    <id>http://osa1.net/posts/2025-06-12-fir-new-variants.html</id>
    <published>2025-06-12T00:00:00Z</published>
    <updated>2025-06-12T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>In the previous two posts (<a href="https://osa1.net/posts/2025-01-18-fir-error-handling.html">1</a>, <a href="https://osa1.net/posts/2025-04-17-throwing-iterators-fir.html">2</a>) we looked at how Fir utilizes variant types for exceptions tracked at function types, aka. checked exceptions.</p>
<p>As I wrote more and more Fir, it quickly became obvious that the current variant type design is just too verbose and difficult to use.</p>
<p>To see the problems, consider a JSON parsing library. This library may throw a parse error when the input is not valid. Before the recent changes, the parsing function would look like this:</p>
<pre><code>parse(input: Str) Json / [ParseError, ..exn]:
    ...
    # When things go wrong:
    throw(~ParseError)
    ...</code></pre>
<p>(As a reminder: <code>[ParseError, ..exn]</code> part is the variant type for the exceptions that this function throws. <code>ParseError</code> is a label for the exception value, and it has no fields. <code>..exn</code> part is the row extension, allowing this function to be called in functions that throw other exceptions.)</p>
<p>This error type is not that useful, because the label <code>ParseError</code> doesn’t contain any information like the error location.</p>
<p>When we start adding fields to it, things quickly get verbose:</p>
<pre><code>parse(input: Str) Json / [ParseError(errorByteIdx: U32, msg: Str), ..exn]:
    ...
    # When things go wrong:
    throw(~ParseError(
        errorByteIdx = ...,
        msg = ...,
    ))
    ...</code></pre>
<p>Now every function that propagates this error needs to include the same fields in the label.</p>
<p>As a second problem, suppose that there’s another library that parses YAML, which also throws an exception with the same label <code>ParseError</code>. Because we can’t have the same label multiple times in a variant (as we would have no way of distinguishing them in pattern matching), we can’t call both library functions in the same function: doing that would result in a type error about duplicate labels with different fields.</p>
<p><em>For the verbosity of labels with fields:</em> we could have type synonyms for variant alternatives, but this doesn’t solve the problem with using the same labels in different libraries.</p>
<p><em>For the label conflicts:</em> we could manually make the labels unique, maybe by including library name in the label, like <code>JsonParseError(...)</code> and <code>YamlParseError(...)</code>.</p>
<p>This makes labels longer, and it doesn’t guarantee that conflicts won’t occur. For example, if we allow linking different versions of the same library in a program, two different versions of the library might have the same label <code>JsonParseError</code>, but with different fields.</p>
<p>A combination of more creative features may solve the problem completely, but features add complexity to the language, even when they work well together. If possible, it would be preferable to improve the utility of existing features instead.</p>
<p>As a solution that uses only existing features, Fir variants now hold named types. The example above now looks like this:</p>
<pre><code>type ParseError:
    errorByteIdx: U32
    msg: Str

parse(input: Str) Json / [ParseError, ..exn]:
    ...
    # When things go wrong:
    throw(~ParseError(
        errorByteIdx = ...,
        msg = ...,
    ))
    ...</code></pre>
<p>(A named type in Fir is anything other than a record or variant. See <a href="https://osa1.net/posts/2021-04-10-sums-and-products.html">this post</a> for more details on named and anonymous types.)</p>
<p>From the type checker’s point of view, a variant is still a map of labels to fields, but we now implicitly use the fully qualified names of types as the labels.</p>
<p>So the variant above looks like this to the type checker: <code>[Label("P.M.ParseError")(P.M.ParseError), ..exn]</code>, where <code>P</code> is the package name and <code>M</code> is the module path to the type <code>ParseError</code>, and the <code>(...)</code> part after the label indicates a single positional field.</p>
<p>This solves all of the problems with labels, and has several other advantages:</p>
<ul>
<li><p>Named types are concise as we don’t have to list all of the fields every time we mention them.</p></li>
<li><p>Named types and their fields can be documented.</p></li>
<li><p>Named types can have methods.</p></li>
<li><p>Named types can be extended with more fields without breaking backwards compatibility. So now it’s possible to add more fields to <code>ParseError</code> without breaking existing users.</p></li>
<li><p>A type with the same name defined in different packages or even modules can now be used in the same variant type.</p>
<p>(When showing a variant type to the user in an error message, we add package and module prefixes as necessary to disambiguate.)</p></li>
<li><p>If I import a named type <code>Foo</code> as <code>Bar</code> in a module, I can use <code>Bar</code> in my variant types and it would be seen as <code>Foo</code> elsewhere.</p></li>
<li><p>Named types can implement traits. This opens up possibilities for implicitly deriving traits for variant types.</p></li>
</ul>
<p>One implication of using the fully qualified path of a type as the label is that we don’t allow the same type constructor applied to different types in the same variant. E.g. <code>[Option[U32], Option[Bool]]</code> is not allowed.</p>
<p>This is the same limitation as with duplicate labels in the original version, where <code>[Label1(x: U32), Label1(y: Str)]</code> wasn’t allowed. I don’t think this will be an issue in practice.</p>
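<p>The labelling scheme can be sketched in a few lines of Python. This is only an illustration of the idea, not how Fir’s type checker is implemented; the function names and the tuple encoding of alternatives are made up, and the package and module names in the note below are hypothetical.</p>
<pre><code>def label_for(package, module, type_name):
    # The label is the fully qualified name of the type constructor,
    # without its type arguments.
    return package + "." + module + "." + type_name


def make_variant(alternatives):
    # alternatives: list of (package, module, type_name, type_args) tuples.
    # Returns a dict mapping each label to its single positional field.
    variant = {}
    for package, module, type_name, type_args in alternatives:
        label = label_for(package, module, type_name)
        if label in variant:
            raise TypeError("duplicate label in variant: " + label)
        variant[label] = (type_name, type_args)
    return variant</code></pre>
<p>With this scheme, <code>ParseError</code> from a <code>Json</code> package and <code>ParseError</code> from a <code>Yaml</code> package get different labels and can coexist, while <code>Option[U32]</code> and <code>Option[Bool]</code> both map to the label of <code>Option</code> and are rejected.</p>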
<p>Pattern matching works as before, but we now omit the labels, as they’re inferred from the types of patterns. Here’s a contrived example demonstrating the syntax:</p>
<pre><code>f() / [Option[U32], ..exn]:
    throw(~Option.None)

g() / [Result[Str, Bool], ..exn]:
    throw(~Result.Ok(Bool.True))

main():
    match try({
        f()
        g()
    }):
        Result.Ok(()): print(&quot;OK&quot;)
        Result.Err(~Option.None): print(&quot;NA&quot;)
        Result.Err(~Result.Ok(bool)): print(&quot;Bool: `bool`&quot;)
        Result.Err(~Result.Err(str)): print(&quot;Str: `str`&quot;)</code></pre>
<p>This is essentially the same as before, just with variant labels omitted.</p>
<p>To keep things simple, I haven’t implemented support for literals in variant syntax yet: <code>~123</code>, <code>~"Hi"</code>, and <code>~'a'</code> don’t work yet. It wouldn’t be too much work to implement this, but I don’t need it right now.</p>
<hr />
<p>In retrospect, using named types in variants is such an obvious improvement, with practically no downsides. But it took a few thousand lines of Fir for me to realize this.</p>
<p>If I discover cases where explicit labels are useful, the current design is not incompatible with the old one. The type checker still uses the same variant representation, with a label and a field for each alternative (with multiple fields represented as records). It shouldn’t be too difficult to support both named types and labels in variant types.</p>
<p>This new design improves error handling quite a bit, but there are still a few problems we need to solve. In a future post I’m hoping to talk about the issues with adding a type component to the function types for exceptions.</p>]]></summary>
</entry>
<entry>
    <title>Throwing iterators in Fir</title>
    <link href="http://osa1.net/posts/2025-04-17-throwing-iterators-fir.html" />
    <id>http://osa1.net/posts/2025-04-17-throwing-iterators-fir.html</id>
    <published>2025-04-17T00:00:00Z</published>
    <updated>2025-04-17T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>Recently I’ve been working on extending <a href="https://github.com/fir-lang/fir">Fir</a>’s <code>Iterator</code> trait to allow iterators to throw exceptions.</p>
<p>It took a few months of work, because we needed multiple parameter traits for it to work, which took <a href="https://github.com/fir-lang/fir/pull/73">a few months of hacking</a><a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a> to implement.</p>
<p>Then there was a lot of bug fixing and experimentation, but it finally works, and I’m excited to share what you can do with Fir iterators today.</p>
<p>As usual, a link to the online interpreter with all of the code in this post is at the end.</p>
<p>Before starting, I recommend reading the <a href="https://osa1.net/posts/2025-01-18-fir-error-handling.html">previous post</a>. It’s quite short and it explains the basics of error handling in Fir.</p>
<p>The previous post did not talk about traits at all, so in short: traits in Fir are the same feature as Rust’s traits and Haskell’s typeclasses<a href="#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a>.</p>
<p>The <code>Iterator</code> trait in Fir is also the same as the trait with the same name in Rust, and it’s used the same way, in <code>for</code> loops.</p>
<p>Here’s a simple example of what you can do with iterators:</p>
<pre><code>sum(nums: Vec[U32]) U32:
    let result: U32 = 0
    for i: U32 in nums.iter():
        result += i
    result</code></pre>
<p>The <code>Vec.iter</code> method returns an iterator that returns the next element every time its <code>next</code> method is called. The <code>for</code> loop implicitly calls the <code>next</code> method to get the next element, until the <code>next</code> method returns <code>Option.None</code>.</p>
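<p>Written out by hand, the loop in <code>sum</code> behaves roughly like the Python sketch below. <code>VecIter</code> is a hypothetical stand-in for the iterator that <code>Vec.iter</code> returns, and <code>None</code> plays the role of <code>Option.None</code> (which only works here because the elements are never <code>None</code>).</p>
<pre><code>class VecIter:
    # Hypothetical stand-in for the iterator returned by Vec.iter.
    def __init__(self, items):
        self.items = items
        self.idx = 0

    def next(self):
        # Returns the next element, or None when the iterator is exhausted.
        if self.idx == len(self.items):
            return None
        item = self.items[self.idx]
        self.idx += 1
        return item


def sum_nums(nums):
    # The for loop, desugared: call next() until it reports the end.
    result = 0
    it = VecIter(nums)
    while True:
        i = it.next()
        if i is None:
            break
        result += i
    return result</code></pre>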
<p>Similar to Rust’s <code>Iterator</code>, Fir’s <code>Iterator</code> trait also comes with a <code>map</code> method that allows mapping iterated elements:</p>
<pre><code>parseSum(nums: Vec[Str]) U32:
    let result: U32 = 0
    for i: U32 in nums.iter().map(parseU32):
        result += i
    result

parseU32(s: Str) U32:
    if s.len() == 0:
        panic(&quot;Empty input&quot;)

    let result: U32 = 0

    for c: Char in s.chars():
        if c &lt; &#39;0&#39; || c &gt; &#39;9&#39;:
            panic(&quot;Invalid digit&quot;)

        let digit = c.asU32() - &#39;0&#39;.asU32()

        result *= 10
        result += digit

    result</code></pre>
<p>This version takes a <code>Vec[Str]</code> as argument, and parses the elements as integers.</p>
<p>The problem with this version is that it panics on unexpected cases: invalid digits and empty input, and it ignores overflows.</p>
<p>Until now, there wasn’t a convenient way to use the <code>Iterator</code> API and <code>for</code> loops to do this kind of thing, while also propagating exceptions to the call site of the <code>for</code> loop, or to the loop variable. But now we can do this: (<code>parseU32Exn</code> is from the previous post)</p>
<pre><code>parseSum(nums: Vec[Str]) U32 / [Overflow, EmptyInput, InvalidDigit, ..errs]:
    let result: U32 = 0
    for i: U32 in nums.iter().map(parseU32Exn):
        result += i
    result</code></pre>
<p>Errors that <code>parseU32Exn</code> can throw are now implicitly thrown from the <code>for</code> loop and reflected in the function’s type.</p>
<p>This new <code>Iterator</code> API is flexible enough to allow handling some (or all) of the exceptions thrown by a previous iterator. For example, here’s how we can handle <code>InvalidDigit</code> exceptions and yield <code>0</code> instead:</p>
<pre><code>parseSumHandleInvalidDigits(nums: Vec[Str]) U32 / [Overflow, EmptyInput, ..errs]:
    let result: U32 = 0
    for i: U32 in nums.iter().map(parseU32Exn).mapResult(handleInvalidDigit):
        result += i
    result

handleInvalidDigit(
    parseResult: Result[[InvalidDigit, ..errs], Option[U32]]
) Option[U32] / [..errs]:
    match parseResult:
        Result.Ok(result): result
        Result.Err(~InvalidDigit): Option.Some(0u32)
        Result.Err(other): throw(other)</code></pre>
<p><code>InvalidDigit</code> is no longer in the exception type of the function because <code>mapResult(handleInvalidDigit)</code> handles them.</p>
<p>We can also convert exceptions thrown by an iterator to <code>Result</code> values:</p>
<pre><code>parseSumHandleInvalidDigitsLogRest(nums: Vec[Str]) U32:
    let result: U32 = 0
    for i: Result[[Overflow, EmptyInput], U32] in
            nums.iter().map(parseU32Exn).mapResult(handleInvalidDigit).try():
        match i:
            Result.Err(~Overflow): printStr(&quot;Overflow&quot;)
            Result.Err(~EmptyInput): printStr(&quot;Empty input&quot;)
            Result.Ok(i): result += i
    result</code></pre>
<p>This function no longer has an exception type, because exceptions thrown by the iterator are passed to the loop variable.</p>
<p>In summary, we started with an iterator that doesn’t throw (<code>nums.iter()</code>), mapped it with a function that throws (<code>map(parseU32Exn)</code>), which made the <code>for</code> loop propagate the exceptions thrown by the map function. We then handled one of the exceptions (<code>mapResult(handleInvalidDigit)</code>), and finally, we handled all of the exceptions and started passing a <code>Result</code> value to the loop variable (<code>try()</code>).</p>
<p>The function’s exception type was updated each time to reflect the exceptions thrown by the function.</p>
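<p>The same pipeline can be approximated in Python, which has no exception types in function signatures, so only the runtime behaviour is shown. In this sketch every element is a zero-argument function (a thunk) so that an exception is raised only when the element is demanded. All names are made up, and <code>parse_u32</code> is a simplified stand-in for <code>parseU32Exn</code> without the overflow check.</p>
<pre><code>class InvalidDigit(Exception):
    pass


class EmptyInput(Exception):
    pass


def parse_u32(s):
    # Simplified stand-in for parseU32Exn (no overflow check).
    if s == "":
        raise EmptyInput()
    if any(c not in "0123456789" for c in s):
        raise InvalidDigit()
    return int(s)


def lazy_map(items, f):
    # Like map: f runs (and may raise) only when a thunk is called.
    return [lambda item=item: f(item) for item in items]


def map_result(thunks, handler):
    # Like mapResult: the handler decides what to do with each outcome.
    return [lambda t=t: handler(t) for t in thunks]


def zero_on_invalid_digit(thunk):
    # Handle InvalidDigit by yielding 0; other exceptions propagate.
    try:
        return thunk()
    except InvalidDigit:
        return 0


def try_each(thunks):
    # Like try(): exceptions become values handed to the loop variable.
    results = []
    for t in thunks:
        try:
            results.append(("ok", t()))
        except Exception as err:
            results.append(("err", type(err).__name__))
    return results


def parse_sum_log_rest(strs):
    total, log = 0, []
    thunks = map_result(lazy_map(strs, parse_u32), zero_on_invalid_digit)
    for tag, value in try_each(thunks):
        if tag == "ok":
            total += value
        else:
            log.append(value)
    return total, log</code></pre>
<p>For the input <code>["12", "x1", "", "30"]</code>, the invalid digit is replaced by <code>0</code>, the empty string is logged as <code>EmptyInput</code>, and the sum is <code>42</code>.</p>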
<p>Once we had multiple parameter traits (which are important even without exceptions, and something we were going to implement anyway), no language features were needed specifically for the throwing iterators API that composes. Changes in the <code>for</code> loop type checking were necessary to allow throwing iterators in <code>for</code> loops. Composing iterators like <code>iter().map(...).mapResult(...).try()</code> in the examples above did not require any changes to the trait system or exceptions.</p>
<p>This demonstrates that Fir traits and exceptions work nicely together.</p>
<p>You can try the code in this blog post <a href="https://fir-lang.github.io/?file=ThrowingIter.fir">in your browser</a>.</p>
<h1 id="im-looking-for-contributors">I’m looking for contributors</h1>
<p>I’m planning a blog post on my vision of Fir, why I think it matters, and a roadmap, but if you already like what you see, know a thing or two about implementing programming languages, and have the time and energy to contribute to a new language, please don’t hesitate to reach out!</p>
<section class="footnotes" role="doc-endnotes">
<hr />
<ol>
<li id="fn1" role="doc-endnote"><p>I started this work in one country, and when finished, I was living in another! This PR really felt like an eternity to finish.<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn2" role="doc-endnote"><p>Implementation-wise, it’s closer to Rust than Haskell as we monomorphise.<a href="#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>]]></summary>
</entry>
<entry>
    <title>Error handling in Fir</title>
    <link href="http://osa1.net/posts/2025-01-18-fir-error-handling.html" />
    <id>http://osa1.net/posts/2025-01-18-fir-error-handling.html</id>
    <published>2025-01-18T00:00:00Z</published>
    <updated>2025-01-18T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>A while ago I came up with an <a href="https://gist.github.com/osa1/38fd51abe5247462eddb7d014f320cd2">“error handling expressiveness benchmark”</a>, some common error handling cases that I want to support in <a href="https://github.com/fir-lang/fir">Fir</a>.</p>
<p>After 7 months of pondering and hacking, I think I designed a system that meets all of the requirements. Error handling in Fir is safe, expressive, and convenient to use.</p>
<p>Here are some examples of what we can do in Fir today:</p>
<p>(Don’t pay too much attention to type syntax for now. Fir is still a prototype, the syntax will be improved.)</p>
<p>When we have multiple ways to fail, we don’t have to introduce a sum type with all the possible ways that we can fail, we can use variants:</p>
<pre><code>parseU32(s: Str) Result[[InvalidDigit, Overflow, EmptyInput, ..r], U32]:
    if s.len() == 0:
        return Result.Err(~EmptyInput)

    let result: U32 = 0

    for c in s.chars():
        if c &lt; &#39;0&#39; || c &gt; &#39;9&#39;:
            return Result.Err(~InvalidDigit)

        let digit = c.asU32() - &#39;0&#39;.asU32()

        result = match checkedMul(result, 10):
            Option.None: return Result.Err(~Overflow)
            Option.Some(newResult): newResult

        result = match checkedAdd(result, digit):
            Option.None: return Result.Err(~Overflow)
            Option.Some(newResult): newResult

    Result.Ok(result)</code></pre>
<p>An advantage of variants is that, in pattern matching, we “refine” the types of binders to drop handled variants from the type. This allows handling some of the errors and returning the rest to the caller:</p>
<pre><code>defaultEmptyInput(res: Result[[EmptyInput, ..r], U32]) Result[[..r], U32]:
    match res:
        Result.Err(~EmptyInput): Result.Ok(0u32)
        Result.Err(other): Result.Err(other)
        Result.Ok(val): Result.Ok(val)</code></pre>
<p>Here <code>EmptyInput</code> is removed from the error value type in the return type. The caller does not need to handle <code>EmptyInput</code>.</p>
<p>(We don’t refine types of variants nested in other types for now, so the last two branches cannot be replaced with <code>other: other</code> for now.)</p>
<p>Another advantage is that they allow composing error returning functions that return different error types:</p>
<p>(Fir supports variant constructors with fields, but to keep things simple we don’t use them in this post.)</p>
<pre><code>readFile(s: Str) Result[[IoError, ..r], Str]:
    # We don&#39;t have the standard library support for file IO yet, just return
    # an error for now.
    Result.Err(~IoError)

parseU32FromFile(filePath: Str) Result[[InvalidDigit, Overflow, EmptyInput, IoError, ..r], U32]:
    let fileContents = match readFile(filePath):
        Result.Err(err): return Result.Err(err)
        Result.Ok(contents): contents

    parseU32(fileContents)</code></pre>
<p>In the early return I don’t have to manually convert <code>readFile</code>’s error value to <code>parseU32</code>’s error value to make the types align.</p>
<p>Variants work nicely with higher-order functions as well. Here’s a function that parses a vector of strings, returning any errors to the caller:</p>
<pre><code>parseWith(vec: Vec[Str], parseFn: Fn(Str) Result[errs, a]) Result[errs, Vec[a]]:
    let ret = Vec.withCapacity(vec.len())

    for s in vec.iter():
        match parseFn(s):
            Result.Err(err): return Result.Err(err)
            Result.Ok(val): ret.push(val)

    Result.Ok(ret)</code></pre>
<p>If a function’s callback parameter is allowed to return more errors than the function I want to pass, I can still pass it without any adjustments:</p>
<pre><code>parseWith2(vec: Vec[Str], parseFn: Fn(Str) Result[[OtherError, ..r], a]) Result[[..r], Vec[a]]:
    let ret = Vec.withCapacity(vec.len())

    for s in vec.iter():
        match parseFn(s):
            Result.Err(~OtherError): continue
            Result.Err(err): return Result.Err(err)
            Result.Ok(val): ret.push(val)

    Result.Ok(ret)</code></pre>
<p><code>parseWith2(vec, parseU32)</code> type checks even though <code>parseU32</code> doesn’t return <code>OtherError</code>.</p>
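<p>To make the resulting type explicit, here is a sketch of a wrapper around that call (<code>parseAllU32</code> is a hypothetical name). Assuming row variables unify the way the examples above suggest, <code>OtherError</code> should not show up in its error type, because <code>parseWith2</code> already handles it:</p>
<pre><code># Hypothetical wrapper: the errors left over are exactly parseU32&#39;s.
parseAllU32(vec: Vec[Str]) Result[[InvalidDigit, Overflow, EmptyInput, ..r], Vec[U32]]:
    parseWith2(vec, parseU32)</code></pre>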
<p>Similarly, if I have a function that handles more cases, I can pass it where a function that handles fewer cases is expected:</p>
<pre><code>handleSomeErrs(error: [Overflow, OtherError]) U32: 0

parseWithErrorHandler(
        input: Str,
        handler: Fn([Overflow, ..r1]) U32
    ) Result[[InvalidDigit, EmptyInput, ..r2], U32]:
    match parseU32(input):
        Result.Err(~Overflow): Result.Ok(handler(~Overflow))
        Result.Err(other): Result.Err(other)
        Result.Ok(val): Result.Ok(val)</code></pre>
<p>Here I’m able to pass <code>handleSomeErrs</code> to <code>parseWithErrorHandler</code>, even though it handles more errors than the <code>handler</code> argument of <code>parseWithErrorHandler</code> requires.</p>
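<p>A sketch of that call, assuming the definitions above (<code>parseU32ZeroOnOverflow</code> is a hypothetical name used only for illustration):</p>
<pre><code># Hypothetical wrapper: overflow is turned into 0 by handleSomeErrs,
# the remaining errors are returned to the caller.
parseU32ZeroOnOverflow(input: Str) Result[[InvalidDigit, EmptyInput, ..r], U32]:
    parseWithErrorHandler(input, handleSomeErrs)</code></pre>
<p>For this to type check, the row variable <code>..r1</code> in the handler’s type would have to be instantiated so that it covers <code>OtherError</code>.</p>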
<h1 id="variants-as-exceptions">Variants as exceptions</h1>
<p>When we use variants as exception values, we end up with a system that is</p>
<ul>
<li>Safe: All exceptions need to be handled before <code>main</code> returns.</li>
<li>Flexible: All of the flexibility of variants shown above applies to exceptions as well.</li>
<li>Convenient:
<ul>
<li>Error values are implicitly propagated to the caller when not handled.</li>
<li>When a library uses one way of error reporting (error values or exceptions) and you need the other, conversion is just a matter of calling one function.</li>
</ul></li>
</ul>
<p>At the core of exceptions in Fir are these three functions:</p>
<ul>
<li><p><code>throw</code>, which converts a variant into an exception:</p>
<pre><code>throw(exn: exn) a / exn</code></pre></li>
<li><p><code>try</code>, which converts exceptions into <code>Result.Err</code> values:</p>
<pre><code>try(cb: Fn() a / exn) Result[exn, a]</code></pre></li>
<li><p><code>untry</code>, which converts a <code>Result.Err</code> value into an exception:</p>
<pre><code>untry(res: Result[exn, a]) a / exn</code></pre></li>
</ul>
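<p>Going only by the three signatures listed above, <code>try</code> and <code>untry</code> look like inverses of each other. A sketch that should type check under that reading (<code>roundTrip</code> is a hypothetical name, not a standard library function):</p>
<pre><code># Hypothetical helper, shown only to illustrate how the types line up.
roundTrip(cb: Fn() a / exn) a / exn:
    # try turns exceptions thrown by cb into Result.Err values,
    # untry throws them again.
    untry(try(cb))</code></pre>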
<p>Here is some of the code above, rewritten to use exceptions instead of error values:</p>
<pre><code>parseU32Exn(s: Str) U32 / [InvalidDigit, Overflow, EmptyInput, ..r]:
    if s.len() == 0:
        throw(~EmptyInput)

    let result: U32 = 0

    for c in s.chars():
        if c &lt; &#39;0&#39; || c &gt; &#39;9&#39;:
            throw(~InvalidDigit)

        let digit = c.asU32() - &#39;0&#39;.asU32()

        result = match checkedMul(result, 10):
            Option.None: throw(~Overflow)
            Option.Some(newResult): newResult

        result = match checkedAdd(result, digit):
            Option.None: throw(~Overflow)
            Option.Some(newResult): newResult

    result

readFileExn(s: Str) Str / [IoError, ..r]:
    # We don&#39;t have the standard library support for file IO yet, just throw
    # an error for now.
    throw(~IoError)

parseU32FromFileExn(filePath: Str) U32 / [InvalidDigit, Overflow, EmptyInput, IoError, ..r]:
    parseU32Exn(readFileExn(filePath))

parseWithExn(vec: Vec[Str], parseFn: Fn(Str) a / exn) Vec[a] / exn:
    let ret = Vec.withCapacity(vec.len())
    for s in vec.iter():
        ret.push(parseFn(s))
    ret</code></pre>
<p>When a library provides one of these, it’s trivial to convert to the other:</p>
<pre><code>parseU32UsingExnVersion(s: Str) Result[[InvalidDigit, Overflow, EmptyInput, ..r], U32]:
    try(||: parseU32Exn(s))

parseU32UsingResultVersion(s: Str) U32 / [InvalidDigit, Overflow, EmptyInput, ..r]:
    untry(parseU32(s))</code></pre>
<p>Nice!</p>
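<p>The two styles can also be mixed inside one function. As a sketch, assuming type refinement in pattern matching works on the <code>Result</code> returned by <code>try</code> the same way it does in the earlier examples (<code>parseU32FromFileOrZeroExn</code> is a hypothetical name):</p>
<pre><code># Hypothetical function: handle IoError locally, rethrow everything else.
parseU32FromFileOrZeroExn(filePath: Str) U32 / [InvalidDigit, Overflow, EmptyInput, ..r]:
    match try(||: parseU32FromFileExn(filePath)):
        Result.Err(~IoError): 0u32
        Result.Err(other): throw(other)
        Result.Ok(val): val</code></pre>
<p>Under that assumption, <code>IoError</code> is dropped from the exception type, just as <code>EmptyInput</code> was dropped from the error value type in <code>defaultEmptyInput</code>.</p>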
<hr />
<p>I’m quite excited about these results. There’s still so much to do, but I think it’s clear that this way of error handling has a lot of potential.</p>
<p>I’ll be working on some of the improvements I mentioned above (and I have others planned as well), and the usual stuff that every language needs (standard library, tools etc.). Depending on interest, I may also write more about variants, error handling, or anything else related to Fir.</p>
<p>You can try Fir online <a href="https://fir-lang.github.io/?file=ErrorHandling.fir">here</a>.</p>]]></summary>
</entry>

</feed>
