Re: Analysis: Runway, a new formal specification system

Hello,

Andrew, your post is a nice comparison of TLA+ and Runway. I have explored both of the tools as well and I believe they each have positive aspects. To me, there are two important, independent parts that I felt strongly about: (1) the specification language itself and (2) the visualization tooling. In my personal opinion, the TLA+ language seems much clearer and understandable than the Runway language. Both seem to operate within the same basic conceptual model (defining states and actions on those states), but the core concepts seem more obfuscated and confusing in the Runway language. At many times, the syntax felt somewhat awkward, perhaps due to an apparent desire to present something closer to a standard programming language syntax. Admittedly, it took me a fair amount of work and study to fully understand the core concepts of TLA+, but once I did, I found the model to be extremely versatile and general, and writing specs in it became much more natural. Runway, on the other hand, felt more like an awkward programming language that provides a somewhat uncomfortable way to write specifications.

Similar to what Ron Pressler noted, I found the visualization tool provided by Runway to be very compelling, and I wonder if something similar could be integrated into the TLA+ tools. As I see it, there are two main approaches to visualization of specifications like those written in Runway/TLA+ (there may be others). The first, which the TLA+ toolbox seems to have added recently, is the "state diagram" visualization. That is, some kind of graph that represents states as nodes (usually boxes or circles), and actions/transitions as edges between them. This is useful for examining the large scale structure of an algorithm/specification, but in my opinion, it is less practical. For one, it becomes unwieldy when the state space is extremely large, and it is harder to understand small scale behaviors. The second mode of visualization, the approach taken by Runway, is more interactive in nature. It provides a way to interactively explore a single trace of the algorithm, by providing a way to "execute" the transitions that are enabled given a current state. I know that TLC provides a depth-first mode, but it could be interesting if it also provided some kind of "interactive" mode, that allowed a client to send commands to TLC to, for example, list the enabled actions given the current state, or perform a specific action to advance the current state. I could imagine an "interactive server" mode of TLC, which would allow external applications, like a web browser, to communicate with it in order to provide a visualization tool that users could interact with.

A little background on myself. I am an engineer at MongoDB, working on their database replication system, and I have been personally interested in applying more formal methods to our work. I became interested in TLA+ because I found it to be an elegant and powerful formalism, and I was searching for better and more precise ways to express system designs and algorithms. The model checker is an excellent component as well, that makes TLA+ more usable for real world problems. Visualization tools, in my opinion, are a great way of communicating ideas to others, and it is one weak spot of these tools that I would be interested in improving. I think that visualizations can act both to share knowledge, and also give more of an intuition about how something works. For example, it may take a considerable amount of effort and intellectual investment to thoroughly understand a non-trivial TLA+ specification by reading through it, but having an interactive way to explore the states and transitions could be valuable, in my opinion. Of course, the visualization would not be there to replace the specification, but would be a sort of view into it. I would be curious to hear other's thoughts on this. If it is considered a desirable enough feature, I would also be interested in contributing to its development.

Will

On Monday, July 4, 2016 at 5:32:13 PM UTC-4, Andrew Helwer wrote:

Objective

This post presents an analysis of the new Runway formal specification system as it relates to TLA+. The following questions will be answered:
In the areas where feature overlap exists, how does Runway compare to TLA+?
What additional features does Runway possess, and how/how well do they work?
What is the possible impact of Runway on formal specification in general and TLA+ in particular?
Introduction

Runway is a new formal specification system developed by Diego Ongaro, the primary author of the Raft paper. Runway comprises a formal specification language, model checker, simulator, and (most interestingly) visualizer. Ongaro introduced Runway in this blog post (TLA+ is mentioned at 25:30 in the embedded video); I recommend skimming it to get the full pitch. You can also see visualizer demos for a few specs here, including one for Raft. I was immediately interested in the simulator and visualizer features, as well as its perceived position as a more accessible formal specification language (see HN discussion). I wrote my own TLA+ elevator spec for in-depth language comparison with the existing Runway elevator spec, and implemented the Die-Hard problem in the Runway language as a crash course on taking a Runway spec from zero to running a visualizer. All of my source files live in this GitHub repo.

Feature Comparison

Below is the feature breakdown at time of writing. Note I tried to avoid features lacking only due to Runway's infancy - for example, distributed model-checking.

Features shared between TLA+ and Runway
Formal specification of safety properties
Exhaustive model-checking of safety properties
Randomized or "simulation mode" model-checking (note Runway has additional capabilities here)
More precise than plain-English specification
Features present only in TLA+
Formal specification of temporal properties (ex. if a person requests an elevator, they will eventually reach their desired floor if they act appropriately)
Exhaustive model-checking of temporal properties
Machine-checked proofs of correctness (TLAPS)
Pretty-printed specifications
Refinement
Features present only in Runway
REPL
In simulation mode, assign time values & probabilities to operations to see system behavior under real-world conditions
Create graphics & animations hooking into the simulation mode to visualize system function
Shared Feature: Specification Language

The largest area of feature overlap between Runway and TLA+ is the language, as used to formally specify safety properties. Fundamental language paradigms are identical; Runway defines next-state relations (called rules in parlance) to be explored by the model-checker while checking invariants. Differences include:
TLC accepts a subset of TLA+, whereas the Runway language has no constructs beyond those accepted by its model checker.
Runway has a type system, which can feel restrictive when trying to do something unanticipated in the language design (for example, cross product).
Runway does not currently support user-defined operators (note: these are different from functions), refinement or temporal properties.
Runway more resembles a conventional programming language; this means it is more verbose (see below), but may be easier to parse for simple static code analysis or mainstream IDE features such as auto-completion.
Same as TLA+, Runway has sequences, sets, and records, but also includes a nice enum type.
TLA+ is much more concise than Runway. My elevator TLA+ spec clocks in at around 225 lines including liveness properties, versus over 450 for the Runway safety spec. We can see the difference in the logic to determine whether an elevator ought to stop at a floor (we stop if either the elevator has the button for that floor pressed, or there is an active call at that floor):

TLA+

IsDestination[e \in Elevator] ==
LET eState == ElevatorState[e] IN
\/ \E call \in ActiveElevatorCalls : call.floor = eState.floor /\ call.direction = eState.direction
\/ eState.floor \in eState.buttonsPressed

Runway

function isDestination(floor: Floor, elevator: Elevator) -> Boolean {
for pid in elevator.riders {
match people[pid] {
Riding(r) {
if r.destination == floor {
return True;
}
}
default { assert False; }
}
}
for call in elevator.floorCalls {
if (call.floor == floor && call.direction == elevator.direction) {
return True;
}
}
return False;
}

Runway's REPL made it easy to get hands-on with the language, and its familiar syntax might endear it to formal specification newcomers. The language was easy to pick up from a TLA+ background, modulo some confusion about ranges (1 .. N) and their relation to sets. I also really liked Runway's enum type, which replaces the common TLA+ approach of using hardcoded strings to indicate state.

Here is the full Runway language documentation.

Runway Feature: The Simulator

The Runway simulator generates a single behavior from the spec. TLC has a similar "depth first" mode, but Runway has something TLC lacks: a global clock. The global clock is used to assign time costs to actions; this works through a special function, past(t), which returns true only if t is less than the current global clock value. Sensibly, past(t) always returns true in normal model-checking mode. Here's how past(t) would look in TLA+:

CONSTANT Actors
VARIABLES WakeTime, GlobalClock

TypeInvariant ==
/\ WakeTime \in [Actors -> Nat]
/\ GlobalClock \in Nat

past(t) == t <= GlobalClock

later(t) == GlobalClock + t

Init ==
/\ WakeTime = [Actors |-> 0]
/\ GlobalClock = 0

Tick ==
/\ GlobalClock' = GlobalClock + 1
/\ UNCHANGED WakeTime

WakeAndDoStuff(a) ==
/\ past(WakeTime[a])
/\ WakeTime' = [WakeTime EXCEPT ![a] = later(5)]
/\ UNCHANGED GlobalClock

Next ==
\/ Tick
\/ \E a \in Actors : WakeAndDoStuff(a)

In Runway, the GlobalClock variable is implicit and its associated functions/actions are part of the language itself. The simulator logic also privileges non-Tick actions, so if WakeAndDoStuff(a) is enabled for any Actor then that action is taken rather than Tick (also, the implementation skips ahead the global clock to the next interesting time). Random choice handles nondeterminism. Speaking of randomness, Runway includes functions to inject probabilistic behavior into the simulation. For example, we can add random network latency:

var messageArrivesAt : Time = later(urandomRange(0, 10000));
rule ProcessMessage {
if (past(messageArrivesAt)) { DoStuff(); }
}

Putting it all together, you can simulate your system in real(ish) world conditions to identify bottlenecks or even gather performance metrics! I can imagine this as a neat way to explore the efficacy of different scheduling algorithms.

Runway Feature: The Visualizer

When I first came across Runway I was most interested in the visualizer. I recall the simulator/visualizer combo as a common request from pupils in the TLA+ class, and was eager to see how well it worked in practice. The theory is you write a bunch of _javascript_ code which hooks into your model simulation and uses the D3 visualization library to associate graphics & animation with each state and action. I'm very much a _javascript_ amateur and have never worked with D3. I'd like to say I accomplished my goal of getting the DieHard visualizer to work, but I didn't - the only current documentation is the raft example, which I found far too complicated. I encourage you to look at the visualizer examples here for examples of what is possible. Perhaps I will return to this project once the documentation is better fleshed out.

The Future

TLA+ can only benefit from Runway's existence. Getting more people into formal specification, especially students, normalizes TLA+'s value proposition even if they aren't using TLA+ specifically. Showcasing TLA+ for people already practicing formal specification is much, much easier than pitching people cold. This isn't to say Runway doesn't compete with TLA+; it does, and I hope the two languages can learn from one another as they explore different formal specification use cases. I call attention to the REPL specifically as something which would benefit TLA+ (such a REPL has been discussed here before). I found Runway's REPL very helpful in quickly prototyping functions and state transitions, and I believe a TLA+ REPL would be of great assistance to newcomers.

TLA+ might also learn from Runway's embrace of modern development styles, which emphasize standalone tools working together rather than the monolithic toolset approach taken by TLA+. For example, it's very obvious which Runway files ought to be checked in to source control whereas with TLC it isn't clear which files in the *.toolbox directories are important. Runway also puts support for user preference in text editors front-and-center, with officially-supported modules for Vim and Atom. The model checker and visualizer modules are separate command line tools, installed by cloning a git repo then running a package manager. This approach aims to keep the user at home in their git + command line + text editor dev environment, keeping IDE magic to a minimum.

Particularly with its visualization module, Runway is ideally situated to be used as a formal specification system in university class projects. The prospect of students exiting university with formal specification experience is exciting, and I hope it's a hit! Runway's verbosity becomes somewhat overwhelming in more complicated specifications, but the language is still very young and many rough edges can be sanded down as it grows. I look forward to its contributions to the formal specification space, and congratulate Ongaro et al. for shipping!

Andrew