[tlaplus] Experience report: unicode-first spec writing

I am currently on a paid contract to write a TLA+ spec. I wanted to make it look professional, so decided to use unicode symbols as much as possible in the presentation. In fact, I decided to make the unicode version the source of truth, converting it to ASCII as needed so the java tools can handle it. I am using the Neovim editor. There are three additional tools I use in my workflow (shameless plug, I primarily wrote all three of them):

The tree-sitter-tlaplus grammar, which parses all the unicode symbols and provides good syntax highlighting; this has first-class support in Neovim via the nvim-treesitter plugin, and is also used to highlight TLA+ specs by github itself
TLAUC, a rust command line tool that translates specs from unicode to ASCII and back while maintaining conjunction & disjunction list alignment
A TLA+ plugin for Neovim, consisting of a simple Lua script auto-replacing the ASCII symbols with their unicode equivalents as you type. I made an asciinema demo of this that you can watch here.

So far things have been working quite well. It really is a pleasure to see all those beautiful math symbols, and it certainly enhances spec readability!

I would say the main drawback is you really need to know your TLA+ syntax. The tree-sitter grammar is not good at reporting syntax errors (it wasn't really made for that). This puts you in a bind, because in order to translate your unicode spec to ASCII so SANY can check it for errors, you already have to have the spec in a parse-able state. This is because TLAUC uses the tree-sitter grammar under the hood and will refuse to translate the spec if there are errors unless you use the --force parameter for a best-effort translation. TLAUC will also only be able to tell you the rough line numbers that errors might occur on, no other details.

One unforeseen quasi-feature is that since the tree-sitter grammar is also used for highlighting, you can kind of catch syntax errors as you type by noticing that the highlighting is a bit weird in some areas. Certainly much less helpful than a language server, but the experience is otherwise quite good!

Hope this was helpful and others try exercising these tools, submitting bugs feature requests etc. You can see the unicode equivalents to all TLA+ symbols here.

Andrew Helwer