osa1

Idea: a more structural code editor

November 2, 2024 - Tagged as: en.

Code is tree structured, but manipulated as a sequence of characters.

Most language tools¹ need to convert this sequence of characters into tree form before they can do anything else.

When the program is being edited, the tree structure is often broken, and often to the point where the tool cannot operate.

For example:

An opening parenthesis, brace, or bracket, without a matching closing one
An unterminated string literal or multi-line comment
A keyword inserted at a wrong place, or without the right tokens afterwards

These can make it impossible to maintain the tree structure of the code.
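To make the first two kinds of breakage concrete, here is a minimal sketch of a checker that reports the first unbalanced delimiter or unterminated string literal in a piece of source text. The names `Breakage` and `find_breakage` are made up for this illustration, and the scanner is deliberately crude: it ignores comments, character literals, and raw strings, all of which a real lexer would have to handle.

```rust
/// A structural problem of the kind listed above (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Breakage {
    /// An opening delimiter at this byte offset was never closed.
    Unclosed { open: char, offset: usize },
    /// A closing delimiter at this byte offset does not match the innermost opener.
    Unexpected { close: char, offset: usize },
    /// A string literal starting at this byte offset was never terminated.
    UnterminatedString { offset: usize },
}

/// Returns the first breakage found, or `None` if delimiters and string
/// literals are balanced. Simplification: comments, char literals, and raw
/// strings are not recognized.
pub fn find_breakage(src: &str) -> Option<Breakage> {
    let mut stack: Vec<(char, usize)> = Vec::new();
    let mut chars = src.char_indices();
    while let Some((offset, c)) = chars.next() {
        match c {
            '(' | '[' | '{' => stack.push((c, offset)),
            ')' | ']' | '}' => {
                let expected_open = match c {
                    ')' => '(',
                    ']' => '[',
                    _ => '{',
                };
                match stack.pop() {
                    Some((open, _)) if open == expected_open => {}
                    _ => return Some(Breakage::Unexpected { close: c, offset }),
                }
            }
            '"' => {
                // Skip to the closing quote, honoring backslash escapes.
                let mut closed = false;
                while let Some((_, s)) = chars.next() {
                    match s {
                        '\\' => {
                            // Skip the escaped character.
                            chars.next();
                        }
                        '"' => {
                            closed = true;
                            break;
                        }
                        _ => {}
                    }
                }
                if !closed {
                    return Some(Breakage::UnterminatedString { offset });
                }
            }
            _ => {}
        }
    }
    // Anything left on the stack was opened but never closed;
    // report the outermost one.
    stack
        .first()
        .map(|&(open, offset)| Breakage::Unclosed { open, offset })
}
```

Calling `find_breakage("fn f() { g(1")` reports the unclosed `{`. A single missing character is enough to leave the whole file without a well-formed tree.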

Since these cases are common, tools need to deal with them. A lot of time and effort is spent on error recovery so that when one of these common cases occurs, the tool can still operate and do something useful.

For some tools handling these cases is a requirement: many language server functions need to work even while the code is being edited and is not in a valid state. For example, “go to definition” should keep working, and “outline” shouldn’t be reset every time the user inserts an opening brace, bracket, or parenthesis.

We can’t invent a new language to solve this problem: this creates a thousand new problems, each bigger than the one we are trying to solve. Designing and implementing a new language is a major undertaking on its own. We can’t design and implement a language and an experimental code editor at the same time, and succeed in both.

So we need to support existing languages, but existing languages are incredibly complex, sometimes with a hundred kinds of statements, expressions, types, and so on.

What I’d like to propose as a solution is a “mostly structural” editor, where programs are edited in a structural way at the highest levels, but as text at the statement and expression level.

The details depend on the language. As an example, let’s consider Rust. In Rust, packages (called “crates”), modules, and the items in modules (function and type definitions) can be defined structurally, because there aren’t a lot of different kinds of top-level declarations. Then in the function (and method) bodies, we write the code as text, as usual.
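As a rough sketch of what such a document model might look like for Rust: crates, modules, and items are ordinary data, and only function bodies are stored as text. Every name here (`Crate`, `Module`, `Item`, `TypeDef`, `Function`, `item_paths`) is an assumption made for illustration. Parameter and return types are kept as plain strings, and type definitions are reduced to a name, to keep the example short; a real editor would likely model them in more detail.

```rust
/// A type definition, reduced to its name in this sketch.
#[derive(Debug, Clone)]
pub struct TypeDef {
    pub name: String,
}

/// A function: the signature is structural, the body is plain text.
#[derive(Debug, Clone)]
pub struct Function {
    pub name: String,
    /// (parameter name, parameter type) pairs; types are strings for brevity.
    pub params: Vec<(String, String)>,
    pub return_type: Option<String>,
    /// Free-form text; may be syntactically invalid while being edited.
    pub body: String,
}

/// The few kinds of top-level items the editor manipulates structurally.
#[derive(Debug, Clone)]
pub enum Item {
    Function(Function),
    Type(TypeDef),
    Module(Module),
}

#[derive(Debug, Clone)]
pub struct Module {
    pub name: String,
    pub items: Vec<Item>,
}

#[derive(Debug, Clone)]
pub struct Crate {
    pub name: String,
    pub items: Vec<Item>,
}

impl Function {
    /// Renders the signature from structural data; the body is never consulted.
    pub fn signature(&self) -> String {
        let params: Vec<String> = self
            .params
            .iter()
            .map(|(name, ty)| format!("{}: {}", name, ty))
            .collect();
        let ret = match &self.return_type {
            Some(ty) => format!(" -> {}", ty),
            None => String::new(),
        };
        format!("fn {}({}){}", self.name, params.join(", "), ret)
    }
}

/// Lists the fully qualified path of every item in the crate.
pub fn item_paths(krate: &Crate) -> Vec<String> {
    let mut paths = Vec::new();
    walk(&krate.items, &krate.name, &mut paths);
    paths
}

fn walk(items: &[Item], prefix: &str, paths: &mut Vec<String>) {
    for item in items {
        match item {
            Item::Function(f) => paths.push(format!("{}::{}", prefix, f.name)),
            Item::Type(t) => paths.push(format!("{}::{}", prefix, t.name)),
            Item::Module(m) => {
                let nested = format!("{}::{}", prefix, m.name);
                paths.push(nested.clone());
                walk(&m.items, &nested, paths);
            }
        }
    }
}
```

Note that neither `signature` nor `item_paths` reads the `body` field, so both keep working no matter what state the body text is in.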

The advantages of this approach are:

We avoid inventing a new language. The idea can be applied to most languages.
Because we isolate invalid syntax to function bodies, no edit can cause syntax errors in the other functions in the same module, or in the other modules and packages.
Because we define function and method signatures separately from function/method bodies, syntax errors cannot invalidate types and cannot generate type errors outside of the function being edited.
For the same reason as above, the “outline” view in the IDE is never broken. Functions like “go to definition” and “find references” always work.
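The isolation property can be illustrated with a small sketch. It assumes a flat, hypothetical model in which each function stores its signature and its body separately (the names `Function`, `Module`, `outline`, `definition_of`, and `edit_body` are invented for this example, and signatures are plain strings for brevity). Editing a body only ever changes that one text field, so the outline and definition lookup never see the broken text.

```rust
pub struct Function {
    pub name: String,
    /// Kept separately from the body, so body edits cannot affect it.
    pub signature: String,
    /// Free-form text; may be invalid at any moment during editing.
    pub body: String,
}

pub struct Module {
    pub functions: Vec<Function>,
}

impl Module {
    /// Outline entries come from the stored signatures only.
    pub fn outline(&self) -> Vec<&str> {
        self.functions.iter().map(|f| f.signature.as_str()).collect()
    }

    /// "Go to definition" is a lookup by name; no parsing is involved.
    pub fn definition_of(&self, name: &str) -> Option<&Function> {
        self.functions.iter().find(|f| f.name == name)
    }

    /// Replaces the body text of one function.
    /// Returns `false` if no function with that name exists.
    pub fn edit_body(&mut self, name: &str, new_body: &str) -> bool {
        match self.functions.iter_mut().find(|f| f.name == name) {
            Some(f) => {
                f.body = new_body.to_string();
                true
            }
            None => false,
        }
    }
}
```

For instance, after `edit_body("parse", "let x = (\"unterminated")` the `parse` body contains both an unclosed parenthesis and an unterminated string, yet `outline()` returns exactly what it returned before the edit.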

As for the GUI part, I imagine an editor “pane” for each function being edited. I should be able to quickly switch between functions (maybe with a fuzzy search similar to ctrl-p in some editors), and when working on a function I should be able to quickly open documentation or definitions of the symbols used in the function, in new panes. I imagine there will be a lot of panes open at any time. This may require a solution like a tiling window manager to quickly arrange them and switch between them.

This problem is not new: I do a lot of buffer/split management every day while coding, and almost never use just a single editor window. However, with each pane editing just one function, there will be a lot more splits and panes. Some creativity will be needed here to make managing these panes easy for users.

¹ I’m not aware of any language tool that doesn’t need to parse the source. Please let me know if you know of such tools.
