Thanks Igor, I missed that sentence in your post: "In the experiments, I am using a custom framework to represent the above specifications-as-code that makes it easy to experiment with different search procedures."

Ovidiu, I've experimented with full state-space dumps. You can do it with the "dot" format for GraphViz and parse that, but it's a hack. Will Schultz's fork of TLC can dump the full state space as JSON. Install it with this script and run it like:

java -jar dist/tla2tools.jar YourSpec.tla -dump json YourSpec

Outputs:
- YourSpec-states.json --- one entry per reachable state with fingerprint fp and variable values val.
- YourSpec-edges.json --- one entry per transition with from, to, and action metadata.

It also outputs an empty JSON file, YourSpec.json; I guess that's a bug in Will's code. There's a discussion about officially adding this feature to TLC.

--

On Tue, Apr 14, 2026 at 8:08 AM Ovidiu Marcu <ovidiu21marcu@xxxxxxxxx> wrote:

Thanks for these highlights.

Would it be possible to pre-generate all states in memory/on disk (maybe get a count) before running any model checking or simulation? For a few trillion states, I would be motivated to explore those alternatives.

Ovidiu

--

On Tue, Apr 14, 2026 at 12:28 AM Igor Konnov <igor.konnov@xxxxxxxxx> wrote:

Thanks Jesse!
Just to clarify for future readers who look at the figures without
going through the rest of the text: the figures in the blog post are
not from TLC simulations. The experiments are closer to Quint's
random simulator. It would be interesting to see how TLC's random
simulation behaves w.r.t. state coverage on these standard benchmarks.
Q-learning is an interesting direction, for sure. The main issue with
it is that it optimizes towards a single objective (dynamically
computed via the rewards), which is somewhat at odds with the
non-determinism in concurrent and distributed systems.
Cheers,
Igor
On Tue, Apr 14, 2026 at 3:53 AM A. Jesse Jiryu Davis
<jesse@xxxxxxxxxxxxxxx> wrote:
>
> Hello. Igor Konnov wrote a terrific post. He shows that running TLC in simulation mode for a long time doesn't find bugs reliably; it's no substitute for exhaustive model checking. Markus wrote a very interesting comment. I'm bringing the discussion over here to the mailing list, where more TLA+ users may see it.
>
> 1. Markus wrote,
>
> enumerating all successors is useful for more than just choosing the next step: TLC can also check invariants on all generated successor states, not only on the one that ends up being sampled. That is a meaningful benefit when the goal is to catch bugs, not just drive a walk.
>
>
> That's interesting. I verified this against the source code (with Claude's help) and confirmed it: at each step of the simulation, TLC picks a random next action, then generates all the successor states that action can produce. It checks invariants on all of them, then picks one at random as the next state. So TLC checks invariants along a "wide path" through the state space.
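[For illustration, that "wide path" loop can be sketched in a few lines of Python. The `actions`, `successors`, and `invariant` callables here are hypothetical stand-ins; TLC's actual implementation is in Java and considerably more involved.]

```python
import random

def simulate_wide_path(initial_state, actions, successors, invariant, max_depth=100):
    """Random walk that checks the invariant on ALL successors of the
    randomly chosen action, then samples just one of them as the next state."""
    state = initial_state
    for _ in range(max_depth):
        action = random.choice(actions)      # pick a random next action
        nexts = successors(state, action)    # enumerate all its successor states
        if not nexts:
            continue                         # action not enabled in this state
        for s in nexts:                      # invariant checked across the whole "width"
            if not invariant(s):
                return s                     # counterexample found
        state = random.choice(nexts)         # but only one successor is walked
    return None                              # no violation within the budget
```

[On a toy counter model where the only action bumps the state by 1 or 2 and the invariant is `state < 5`, the loop reports the violating successor even when the walk itself never visits it.]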
>
> 2. He also wrote:
>
> for specs that explicitly model faults, it’s not too hard to correct for “faulty transitions dominate the search” by adjusting the probabilities of the fault actions themselves. That’s the approach we’ve used in practice; SIMccfraft.tla is one example (https://github.com/microsoft/CCF/blob/main/tla/consensus/SIMccfraft.tla). Recent TLC versions also have extended simulator coverage statistics (-Dtlc2.tool.Simulator.extendedStatistics=true), which make this kind of tuning easier to reason about.
>
> It’s also worth mentioning that TLC supports reinforcement learning / Q-learning in addition to random walk. In our experience, though, getting Q-learning to work well is not especially straightforward, because picking the right variables for the hash/input space is non-trivial. Compared to that, manually adjusting action probabilities is much easier to understand and explain. We’ve also found Q-learning’s wall-clock performance to be significantly worse than random simulation with hand-tuned action probabilities.
>
>
> IIUC, a human spec author can tune the action probabilities to drive the system into interesting states more often (see the file he linked for example).
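[As a toy illustration of that tuning: instead of picking actions uniformly, sample them with author-chosen weights so fault actions fire less often. The action names and weights below are hypothetical, not SIMccfraft's actual values.]

```python
import random

# Hypothetical weights: normal protocol actions fire often,
# fault-like actions are deliberately down-weighted.
ACTION_WEIGHTS = {
    "ClientRequest": 5,
    "AppendEntries": 5,
    "Timeout": 1,
    "CrashNode": 1,
}

def pick_action(weights):
    """Sample one action name with probability proportional to its weight."""
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]
```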
>
> It's also possible to use Q-learning: you define a "reward" function that says how interesting a state transition is, and TLC learns to favor whatever led to high-reward transitions. You can configure TLC to key the learning on the states that led to those transitions (probably too fine-grained) or on the actions that led to them (probably too coarse); those are the only two options. Q-learning appears to be an undocumented, experimental feature.
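[To make the reward-driven idea concrete, here is a generic tabular Q-learning sketch over action names, i.e. the coarse option above. This is illustrative only, not TLC's implementation; `step` and `reward` are hypothetical callables supplied by the user.]

```python
import random
from collections import defaultdict

def q_learn_actions(initial_state, actions, step, reward,
                    episodes=200, depth=30, alpha=0.5, gamma=0.9, eps=0.2):
    """Learn one Q-value per action name, favoring actions whose
    transitions earn high rewards (epsilon-greedy exploration)."""
    q = defaultdict(float)
    for _ in range(episodes):
        state = initial_state
        for _ in range(depth):
            if random.random() < eps:                     # explore
                action = random.choice(actions)
            else:                                         # exploit learned values
                action = max(actions, key=lambda a: q[a])
            nxt = step(state, action)
            r = reward(state, action, nxt)                # "how interesting was this transition?"
            q[action] += alpha * (r + gamma * max(q[a] for a in actions) - q[action])
            state = nxt
    return dict(q)
```

[On a toy model where "inc" advances the state and "noop" doesn't, with a reward of 1 for any transition that changes the state, the learner ends up valuing "inc" above "noop".]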
>
> Markus LMK what I got wrong. =)
>
> --
> You received this message because you are subscribed to the Google Groups "tlaplus" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+unsubscribe@xxxxxxxxxxxxxxxx.
> To view this discussion visit https://groups.google.com/d/msgid/tlaplus/CAFRUCtYyeVuGNFg9iKBehN-x2dzB50govFYvdpqASi-PcDwu1A%40mail.gmail.com.
--
Attachment:
JsonStateWriter.java
Description: Binary data