Re: [tlaplus] Re: Reusing Specs

Dear Rodrigo,

As Leslie Lamport has kindly stated in the brilliant note "Reusing Specs", the correct way of reusing a TLA+ spec is "copy-paste-modify".

First let us be clear on the meaning of "reuse". As programmers, we tend to put pieces of code that we would "use again without modifications" in the future in a library, for example. I think this is because we are much used to living with "black boxes". For example, we call a function in a C library and we would expect it to magically perform a specific task, and we would rely on (possibly written!) comments of other programmers.

There are no black boxes in the world of formal specification using TLA+.

I think that there is an obvious reason for this. When we write a specification we would require a "single competent judge" (usually the writer of the spec) to make sure that everything is correct. Writing programs is different. Hundreds of programmers contribute to a software development cycle these days, and we cannot expect them to copy-paste-modify everything.

But system specification is completely different. A specification is an abstraction, which means it has less details. It would be really hard to write a good spec that is longer than a couple of thousands of lines. Perhaps that is because one person finds it difficult to understand longer specs, as it is mentioned in the original note "Reusing Specs".

Specifications are not comparable to programs in terms of number of written lines. We write specifications to build correct systems. Building a system is different than creating a piece of art and we are concerned with understandability of our work.

Finally, reusing specs by adding extra parameters is possible, but it is discouraged and it is not the right way to specify a system. A good spec should have everything required to understand it inside itself, instead of referring to other specs for clarification.

Sincerely,
AmirHossein

On Tue, Jun 18, 2019 at 8:01 PM Rodrigo Q. Saramago <deepmarolaest@xxxxxxxxx> wrote:

Hi, Lamport reading your post made me think about some specifications that a friend and I wrote some months ago.

Our goal was to write a specification for a state machine replication (RSM) protocol that uses multiple instances of
the "Paxos-family" protocols to decide on each entry of a replicated log, and we would like to have a way to test it
on various failures scenarios (because we were learning about these protocols).

In that time we also wanted to learn more about TLA+, so we decided to start by getting your specification of Paxos
from the Github (Paxos.tla) and do a refinement mapping of it in our specification.

We noted that your Paxos specification implements a Voting specification which in turn implements the Consensus
spec. So we thought this was the natural way to proceed.

Then we copy all of these three specs, and we apply minor modifications to your original Paxos specification in
order to make it more modular for our needs, but trying to not change its logic. We didn't modify the Voting or the
Consensus spec.

We basically split the formula definition of some steps (like the phase1b bellow) in two different operators:
"condition" and "action", which allowed us to reuse these operators in the RSM spec and also overwriting it using
the configuration files (to tests slight modifications of some operators and simulate failures, for example).

----------------------------Paxos----------------------------------------
VARIABLE state
...
Init == state = [maxBal |-> [a \in Acceptor |-> -1],
                       maxVBal |-> [a \in Acceptor |-> -1],
                   maxVVal |-> [a \in Acceptor |-> None],
                   msgs |-> {}]
...
Phase1bCondition(a, m) == /\ m.type = "1a"
                         /\ m.bal > state.maxBal[a]

Phase1bAction(a, m) ==
    [state EXCEPT !.maxBal[a] = m.bal,
                   !.msgs = Send([type |-> "1b",
                                   acc |-> a,
                                                   bal |-> m.bal,
                                     mbal |-> state.maxVBal[a],
                                                   mval |-> state.maxVVal[a]])]

Phase1b(a) == \E m \in state.msgs :
                                                      /\ Phase1bCondition(a, m)
                /\ state' = Phase1bAction(a, m)
------------------------------------------------------------------------

---------------------------RSM-Paxos---------------------------
VARIABLE states,        \* Array of paxos states
           log,           \* Array os decided values
                 proposed       \* All instances knows about the proposed values
...
P(i) == INSTANCE Paxos WITH Acceptor <- Replica,
                              state    <- states[i]
...
Phase1b(i,r) == /\ \E m \in states[i].msgs : /\ P(i)!Phase1bCondition(r, m)
                                               /\ states' = [states EXCEPT ![i] = P(i)!Phase1bAction(r, m)]
                /\ UNCHANGED <<log, proposed>>
------------------------------------------------------------------------

After played with crash-stop models, testing it using different configuration files and making assertions about the
invariances, we decided to move to the other failure models, crash-recovery and also Byzantine. We ended up
implementing a spec that extends our RSM and works as a "test-suite framework", declaring our test models that
are "instantiated" using the TLC configuration file.

For example, we use your byzantine paxos spec: Byzantine Paxos Refinement, and apply minor modifications to
our RSM spec to have a Byzantine version of it.

---------------------------RSM-BPConProof------------------------------------
BP(i) == INSTANCE BPConProof WITH Acceptor     <- CorrectReplica,
                                     FakeAcceptor <- FakeReplica,
                                                       state        <- states[i]
...
Phase1b(i,r,b) == /\ BP(i)!Phase1bCondition(r,b)
                /\ states' = [states EXCEPT ![i] = BP(i)!Phase1bAction(r,b)]
                /\ UNCHANGED <<log, proposed, bal>>

Phase1c(i,b) == /\ \E S \in Possible1cMsgsIn(b):
                         states' = [states EXCEPT ![i] = BP(i)!Phase1cAction(S)]
                  /\ UNCHANGED <<log, proposed, bal>>
...
-----------------------------------------------------------------------------

So reuse existing specification in this way shown very valuable for us in that time, and together with the papers,
helped us to understand better the protocols. But now, I'm wondering if our approach was completely wrong and
we misuse TLA+, TLC, and the toolbox. Because we kind of copy/paste/modify the original Paxos and ByzPaxos
specs, but we didn't modify the Voting and Consensus that still used internally by them.

Rodrigo

Em sexta-feira, 14 de junho de 2019 16:19:51 UTC+2, Leslie Lamport escreveu:
I have observed that many new users want to write TLA+ specs so they
can be reused. I have one word of advice for those users: Don't.
The way to reuse a TLA+ spec is copy, paste, and modify.
This flies in the face of everything you've learned about writing
programs. It does so for the same reason that TLA+ does not look like
any programming language: specifications are very different from
programs.

The most obvious difference between them is size. I have the
impression that the specs written by engineers at Amazon and Microsoft
working on cloud systems are a few hundred lines long. Other than one
outlier, the longest specs I know of were written by engineers at
Intel. They were about 2000 lines long. I was told that this was
because that was the largest spec a single person could understand.
If a spec started becoming longer than that, they would remove details
by abstraction to make it smaller. Programming languages could be a
lot simpler if no one ever wrote a program longer than 2000 lines.

A more important difference appears when you consider what one reuses
when programming. If you're writing a sorting program, you don't make
it reusable in the sense that you could write the program so that, by
changing the parameters, you could change it from executing quick sort
to executing heap sort. Reusing means being able to write another
program that uses sorting as a primitive, without caring about how
it's implemented. In a TLA+ spec of an algorithm that uses sorting,
that use might appear as a subformula like A' = Sort(A), where Sort
has a 3-line definition. You hardly need a separate module for a 3-line
definition. A reusable sorting program needs to pass in arguments
specifying where to find the sorting key and what the ordering
relation is. In TLA+, you just copy a Sort definition used in one
spec and make some trivial modification to those three lines to
reflect the difference in the location of the sorting key and the
ordering relation. This is easy enough, and it makes the spec a lot
simpler than if you imported a single definition that could be reused
without modification.

A Sort operator is actually one of the most complicated parts of a
TLA+ spec you're likely to reuse. Consider message passing in a
distributed algorithm. If you wrote a distributed program and had to
write your own code for sending and receiving messages, you would
certainly not want to have to rewrite it for your next distributed
program. Now look at the spec of the Paxos consensus algorithm in the
TLA+ GitHub examples repository. Here is the "procedure" that's used
to send an arbitrary message m:

   Send(m) == msg' = msg \cup {m}

It was hardly worth writing that definition just to be able to write
"Send(m)" instead of "msg' = msg \cup {m}" throughout the spec. But I
thought it made the spec a little easier to read. There isn't even
any explicit receive action. The presence of a message to be received
is simply an enabling condition of an action. What can be reused from
the spec is the idea of representing message passing with a variable
containing the set of all messages that have been sent. Simple ideas
are expressed simply in TLA+; you don't need to reuse the "code" that
embodies them.

If you want to learn to write specs, write a spec of a particular
system or algorithm you're interested in. Worry about getting that
system or algorithm right. Don't think about what else you might want
to do with your spec. That's what I do. That's what engineers who use
TLA+ to build real systems do.

Leslie

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+unsubscribe@xxxxxxxxxxxxxxxx.
To post to this group, send email to tlaplus@xxxxxxxxxxxxxxxx.
Visit this group at https://groups.google.com/group/tlaplus.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/ea0fba55-fef3-46f8-b72b-004c63b1e388%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.