
Macros in Fir

May 12, 2026 - Tagged as: en, plt, fir.

Fir macros are fully deterministic programs that are distributed separately and that can introspect into the using program’s type-checked AST definitions.

Deterministic execution of macros is necessary to be able to reproduce builds: given the same inputs, a macro must always generate the same code, regardless of the macro call's location, the type checker's internals, or the compilation settings.

Separate distribution is not a requirement but it simplifies the implementation quite significantly, and it’s also a good idea from a software design point of view:

It should be rare for a macro and a library to recursively depend on each other. In these rare cases, the library can be split into two smaller libraries: one that the macro uses, another one that uses the macro (and maybe also the first library).

Introspection makes macros much more flexible and useful for many use cases. Without introspection, macros take ASTs (or token trees, or code as a string) as arguments and need to be passed those ASTs directly. E.g. in Rust, derive macros need to be attached as an attribute to a definition, because they can't otherwise look up the definition of a type that you pass to them as an identifier.

#[derive(MyDeriveMacro)]
struct Foo { ... }

Here MyDeriveMacro is passed the token trees of the next item (the type definition). This has annoying limitations: the macro only sees the one item it is attached to, and it can't follow references from that item to other definitions.

In Fir macros, you can pass explicitly (this part is important) as many type and function identifiers as you want and the macro gets the full type-checked ASTs of the definitions of those types and functions. These type-checked ASTs also allow looking up definitions used by those types and functions, so you also get the dependencies of those types and functions.

Short intro to the syntax before moving on to examples: $ indicates a macro call, @ is the syntax for passing a definition (rather than token tree) to a macro. E.g. @foo(@MyType, @myFunction, some [other, random = ("tokens")]) passes the full type-checked ASTs of MyType and myFunction to the macro, and an untyped token tree for the third argument.

Introspection opens up many possibilities, such as deriving an equality function from a type's full definition and the definitions of its field types, and it solves many of the problems with purely syntactic macros (macros that take just a string, a list of tokens, or ASTs as arguments).

Interaction with type checking

Fir has been designed from day 1 for parallel type checking and compilation.

With macros that can be passed type-checked ASTs, type checking gets interleaved with macro expansion, but module-level parallelism is not affected. This is because macros can't generate imports, and generated code (same as hand-written code) can't access definitions that are not imported. (E.g. Fir doesn't have paths like Rust's crate::... or package::...; you can use names with qualified paths, but the modules in those paths still need to be imported explicitly first.)

So we process the modules the same way as before: starting from the main module (or the public modules in a library), we create a dependency DAG of modules1. This DAG can then be processed in parallel as before. Macros have no influence over this DAG of modules.

Within a module though, things get a bit tricky. A macro call can only be expanded after the definitions it introspects into are fully type checked, but it also needs to be expanded as soon as possible as it may generate definitions that other definitions use.

To deal with the first part of the problem (determining macro dependencies), we require that definitions are passed to macros explicitly (with the @<identifier> syntax used above). If a definition is not explicitly passed to a macro and it's not a dependency of a definition that is explicitly passed, the macro won't have access to it.

However, determining macro outputs ahead of time is not possible. So to deal with the second part of the problem, we expand macros as soon as their dependencies are type checked. This creates a schedule of type checking and macro expansion operations in each module. The name resolution pass creates a dependency DAG of module-level items. Macro calls are also part of this DAG, and their dependencies are determined by the @... arguments they're passed. When a definition contains an unbound name, that name may be generated by any macro in the module that doesn't depend on the definition, so the definition gets a dependency edge to each such macro. Macros can't be in a recursive dependency group (SCC) with other macros or definitions, so in the DAG we require that each macro is in its own group.

When we process this DAG of type checking and macro expansion operations in topological order we type check macro definitions before macro expansion, and expand macros before any potential dependencies on their expansions.2

Macro call locations are not important for this algorithm, as the definitions are not processed in source code order. You can put a macro call anywhere in a module and it works the same way.
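The schedule described above boils down to a topological sort over definitions and macro calls. Here is a minimal sketch in Python; the node names and the dependency rules encoded in the dict are invented for illustration and are not the Fir implementation:

```python
# Sketch: scheduling type checking and macro expansion within one
# module as a topological sort over a dependency DAG.
from graphlib import TopologicalSorter

# Hypothetical module contents:
# - `Point` is a type definition with no dependencies;
# - `derive` is a macro call passed @Point, so it depends on Point;
# - `main` uses a name no other definition binds, so it must wait for
#   every macro that doesn't depend on it (here: `derive`).
deps = {
    "Point": set(),
    "derive": {"Point"},
    "main": {"derive"},
}

# Process nodes in topological order: type check Point, then expand
# the macro, then type check main (which may use generated code).
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['Point', 'derive', 'main']
```

The same sort also explains why macros can't be in an SCC with other items: a cycle through a macro node has no valid topological order.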

Interaction with the trait environment

Macros can generate traits and impls, but they don't have access to the trait environment and can't introspect into traits and impls.

Deterministic execution of macros

This is not enforced in the current prototype, but it will be in the final version.

Once the effect system is ready, we can require that a function have no effects to be usable as a macro.

However, in any kind of statically checked system there will always be escape hatches (for the system to be practically useful), so compile-time/type-level enforcement alone won't be enough, and we'll need to sandbox the macro programs regardless of what we check at compile time.

One easy option would be compiling macros to Wasm and making the host calls for IO (and other things we don't allow in macros) fail. This is easy to implement, but it requires embedding a Wasm engine in the language front-end, and execution will be slower than a native executable: (1) the Wasm will need to be interpreted or JIT compiled; (2) a native library could be loaded dynamically into the same address space, so we could share immutable references to type-checked ASTs with the macros, instead of serializing and deserializing ASTs as they're passed to macros and generated ASTs are returned to the front-end.

The details here are to be determined.

The macro API

In the prototype, macros are a part of the implementation and they use the internal data structures of the compiler.

One of the other goals with Fir since the early days is to have the language front-end available to users as libraries. To avoid creating yet another library/API when we already have the language front-end available, the macros will probably use the language’s official AST library.3

To avoid passing large ASTs to macros when a macro only needs the main type being passed (without its dependencies), we allow back-and-forth between a macro and the language front-end. A macro will be able to request the ASTs of dependencies of the main AST it was passed; it won't get the whole thing in one call.
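This back-and-forth can be sketched as a host object handed to the macro alongside the main AST. Everything below (MacroHost, request_dep, the dict-shaped ASTs) is an invented illustration, not the actual Fir API:

```python
# Sketch: the macro pulls dependency ASTs on demand instead of
# receiving one large up-front payload.

class MacroHost:
    def __init__(self, asts):
        self._asts = asts      # name -> (pretend) type-checked AST
        self.requests = []     # record what the macro actually asked for

    def request_dep(self, name):
        self.requests.append(name)
        return self._asts[name]

def derive_eq(host, main_ast):
    # Fetch only the field types this macro actually needs.
    dep_asts = {f: host.request_dep(f) for f in main_ast["field_types"]}
    return {"impl": "Eq", "for": main_ast["name"], "uses": sorted(dep_asts)}

host = MacroHost({
    "Point": {"name": "Point", "field_types": ["U32"]},
    "U32": {"name": "U32", "field_types": []},
})
out = derive_eq(host, host.request_dep("Point"))
print(out["uses"], host.requests)  # ['U32'] ['Point', 'U32']
```

The recorded `requests` list also hints at why this design helps determinism: the front-end can observe and bound exactly which definitions the macro looked at.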

Macros will only have access to the definitions they're explicitly passed (with the @<identifier> syntax) and won't be provided anything other than the passed definitions and their dependencies, even if the type checker happens to have checked more by the time of expansion. This is part of the determinism requirements: given the same inputs, macros should always generate the same outputs. The location of the macro call and the type checker's internals (or checking order) should not matter and should not change macro expansion.

Quotation in macros will be implemented using macros. E.g. instead of generating an expression's AST manually:

Expr.BinOp(
    left = Expr.Var(...),
    op = Binop.Add,
    right = Expr.Call(...),
)

we implement (and distribute as part of the language) quotation macros and write:

$expr(var + f(...))

$expr here is a macro that parses its argument (a token tree) and converts it to a Fir AST expression.

(This is the same idea as Rust’s quote package.)
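As a loose analogy from Python's standard library, ast.parse does for Python source what $expr would do for Fir token trees: it turns a written-out fragment into an AST instead of having you construct the nodes by hand.

```python
# Python's stdlib `ast` module as an analogy for $expr: parse a source
# fragment into an AST rather than building BinOp/Call nodes manually.
import ast

quoted = ast.parse("var + f(x)", mode="eval").body  # an ast.BinOp node
print(type(quoted).__name__)      # BinOp
print(type(quoted.op).__name__)   # Add
```

The hand-built equivalent (ast.BinOp(left=..., op=ast.Add(), ...)) is exactly the kind of boilerplate quotation avoids.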

We can pass type-checked ASTs and token trees, but not parsed (not yet type checked) ASTs. I'm not sure how useful that would be, but if it becomes useful we can easily extend the system to allow passing parsed ASTs to macros. E.g. maybe @@expr[...] would parse an inline expression and pass it to the macro as an expression AST.

In the meantime, macros can parse the token trees however they want using the language's libraries, instead of expecting parsed inputs.

Macro functions are ordinary Fir functions with a particular signature, but that signature allows passing different numbers of arguments with different types (token trees, type identifiers, function identifiers). The idea is that the same macro function can handle multiple call patterns, as in the deriveEq example above.

The function signature for this macro looks something like:

deriveEq(inputs: Vec[TokenTree]) Ast: ...

Where TokenTree is a sum type covering actual token trees as well as type and function identifiers, and Ast is a sum type with constructors for top-level items, expressions, statements, and anything else that we allow macros to generate.

The reason for this design is flexibility: by allowing an arbitrary sequence of (potentially comma separated) token trees in the argument list, the same macro function can accept many call patterns, with the macro itself doing sanity checking of the arguments.
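A small sketch of this argument model, with invented variant names (TypeId, FunId, Tokens) standing in for the TokenTree constructors; this is not the actual Fir API:

```python
# Sketch: a macro receives a flat list of TokenTree-like values and
# does its own sanity checking, since the signature admits any pattern.
from dataclasses import dataclass

@dataclass
class TypeId:
    name: str   # stands in for an @Type argument

@dataclass
class FunId:
    name: str   # stands in for an @function argument

@dataclass
class Tokens:
    text: str   # stands in for a raw token-tree argument

def derive_eq(inputs):
    # Accept any mix of arguments, but require at least one type.
    type_ids = [arg for arg in inputs if isinstance(arg, TypeId)]
    if not type_ids:
        raise ValueError("deriveEq: expected at least one @Type argument")
    return [f"impl Eq for {t.name}" for t in type_ids]

result = derive_eq([TypeId("Point"), TypeId("Line"), Tokens("extra")])
print(result)  # ['impl Eq for Point', 'impl Eq for Line']
```

The error path is the important part: since the signature can't rule out bad call patterns, the macro body has to.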

Conditional compilation

As mentioned in the intro, it's a deliberate design goal that macros are fully deterministic and generate the same code for the same inputs, regardless of the compilation settings (host or target platforms, optimization parameters, etc.).

Conditional compilation in Fir will be done by dedicated language features (that don't exist today). Macros will be able to generate code that uses those conditional compilation features, but they won't be doing conditional compilation themselves.

For example, if we have a syntax for checking the target architecture's pointer size, macros won't be able to use it themselves, but they will be able to generate code that uses it.

This doesn't complicate the macro system implementation any more than it already is: as mentioned, we need to sandbox macros anyway (or somehow make sure at compile time that they don't have access to certain APIs). We just prevent access to conditional compilation features in similar ways.

Hygiene

Macro-generated code is name resolved and type checked in the using module’s environment, and so it can refer to names available at the macro call site.

To avoid issues when a call site imports, e.g., the standard library under a prefix while the macro generates references to standard library types, macros should generate fully qualified names: Fir/Vec/Vec[U32] instead of just Vec[U32]. However, this is not enforced.

For the cases when a macro generates type or term ids that shouldn’t shadow definitions at the call site (either in the macro-generated code, or the code around the macro expansion), we provide a gensym function in the standard library. This function is only accessible by macros.
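A minimal sketch of what a gensym-style scheme could look like. The counter and the '%' separator below are invented for illustration; Fir's actual gensym may work differently:

```python
# Sketch: gensym-style hygiene. Generated helper names get a suffix
# that can't appear in user-written identifiers, so they can't collide
# with or shadow names at the macro call site.
import itertools

_counter = itertools.count()

def gensym(base):
    # '%' is assumed here to be invalid in user identifiers, so a
    # gensym'd name can never clash with hand-written code.
    return f"{base}%{next(_counter)}"

a = gensym("tmp")
b = gensym("tmp")
print(a, b)  # prints: tmp%0 tmp%1
```

Two calls with the same base still yield distinct names, which is exactly what a macro needs for its internal helper definitions.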

ASTs of types and terms passed to macros (with the @<identifier> syntax) are already name resolved, and using parts of those ASTs in the output generates qualified names that can't be shadowed at the call site. This is not done via a magic AST node that only the language front-end can create: identifiers in the ASTs can carry qualifications or prefixes, and that's how macros should generate qualified names whenever possible. When we pass @MyType to a macro and MyType uses Vec, the Vec references in MyType's AST carry the fully qualified path to Vec. So copying that into the output also gives us a fully qualified Vec reference that we could just as well have written by hand.

Being able to write the fully qualified path Foo/Bar/Baz (or to macro-generate it) doesn't mean we can avoid importing Foo/Bar. It's an explicit goal of Fir modules that dependencies are always fully specified in the imports. While we can access a definition in more than one way (with fully qualified paths, directly via the imported name, or by importing the same definition under different prefixes or names), there's no way to access a definition without importing a module that exports it.

Macros don't change this fact. They should always generate fully qualified paths, to avoid shadowing and to avoid depending on modules being imported in a particular way, but the references in the generated code still need to be explicitly imported by the calling module. This may mean that a macro call site sometimes needs imports that look unused, because the imported names are only used in the macro's expansion.

The principle here is that macros can’t generate code that you can’t write by hand.

Final thoughts and current status

Unlike the other blog posts about Fir, the features here are not fully implemented. The parts up to the deterministic execution section above are currently implemented in a prototype and working.

The type checker requires quite a lot of refactoring for the proper implementation, which I’m slowly working on.

I think this is the final feature Fir needs to be considered a proper language, ready to tackle real problems. Once done, we’ll focus on bootstrapping the language.


  1. Fir allows recursive module imports, so it’d actually be more accurate to say “dependency DAG of SCCs of modules”. To keep things simple in this discussion we can assume modules can’t be recursive.↩︎

  2. There’s an edge case here that we don’t deal with and let things fail to type check: when a macro generates e.g. foo but there’s also an imported foo, definitions in the module that use foo can use either the imported foo (if they’re scheduled before the macro expansion) or the macro-generated foo (if they’re scheduled after the macro expansion, because local definitions shadow imported ones). This case should be extremely rare and it’s not worth complicating the design or implementation more to deal with this.↩︎

  3. The library will probably provide different ASTs for parsed and type checked programs, which is easy to do in Fir thanks to extensible named types.↩︎