Re: [tlaplus] TLC not handling disjunction correctly in special case?

Hi Andrew,

From the perspective of using TLA+/TLC to establish the correctness of architects' and designers' (something I did for many years in industry), what you are seeing is a desired behavior.

(Allow me to apologize for such a long reply to a simple question. Our team debated ideas surrounding what you touched on many times over multiple projects with lots of customers. I have much to say as a result. Let me know if any of the following doesn't make sense. Hopefully the length won't bore you to death.)

First, I want to say something we had to remind ourselves regularly: We write specifications in TLA+, not programs or informal descriptions.

With that said, let me generalize your Formula action and add a next state relation to it.

Formula ==

/\ \/ A

\/ B

/\ N

where N is the next state relation and A and B are preconditions for N to occur. In this form, what Formula says to the architects and designers I worked with is

If the system is in a state where A is true, then N can occur.

and

If the system is in a state where B is true, then N can occur.

I used "and" there to clarify that A and B are separate conditions even though in our TLA+ specification we would combine them disjunctively. In a specification and in their minds, there is no relationship between A and B. Formula is really two specifications in one because of the disjunction.

Thus, if my customers saw your specific version of Formula, they would interpret that as

If record has a NoRecord value, then the system can act

and

If record.key is true, then the system can act

They would immediately balk at the second specification because there is no check that record has a valid key. And they would be right. So, even before running TLC I would have to rewrite the TLA+ as

Formula ==

/\ \/ record = NoRecord

\/ /\ record # NoRecord

/\ record.key

/\ NextStateStuff

We would write the spec this way instead of using an if-then or implication because the specification is clearer and fully spelled out. When the architects and designers read the specifications, they want to see the clear, specific rules. If the rules require them to think about language semantics or take data construction implications into mind, then they are not good rules and they will rightly push back. They would push back on your second disjunct because it implies record has to be checked first. They don't want to have to figure that out. Their jobs are stressful enough as it is. Plus, leaving the implication for them to figure out provides an effective path for a bug to get into the design, which runs counter to why we write specifications in the first place: To prevent bugs in designs.

For us folks using TLA+/TLC, we want TLC to handle the disjuncts separately. TLC parallelizes disjunction analysis, thank goodness. In our designs, we had so many actions with so many disjuncts that if TLC evaluated them all in order to check for side effects, performance would have tanked. In a fully mature specification, the next state relation is an enormous disjunction (of nested disjunctions of nested disjunctions of...). Evaluating all that in order would be prohibitive.

In short, for the sake of writing clear specifications, the TLC exception you are seeing is a good thing, and for TLC performance, we want it to evaluate disjuncts in parallel.

Sorry again for a complex answer to a simple question.

Robert

On Mon, May 22, 2017 at 4:15 PM, Andrew Helwer <andrew...@xxxxxxxxx> wrote:

I've created a minimal spec to demonstrate what might be a case of TLC not properly handling disjunction. Running TLC on this spec produces a runtime exception. The case occurs when you have a variable which can either hold a record value or a null value; this formula will throw an exception:

Formula ==
    \/ record = NoRecord
    \/ record.key

Of course TLC isn't expected to exactly correspond to mathematical semantics (it cares about the order of elements in a conjunction when defining primed variable values, I believe) but might this be a case where TLC can "short-circuit" the evaluation of the formula once it sees the first disjunct is satisfied? Or even wrap every disjunct in a try/catch to see whether it can find one which both fails to throw an exception and evaluates to true? Maybe this has all been considered before and rejected because it produces surprising behaviour elsewhere.

Here's the full spec:

-------------------------------- MODULE Bug --------------------------------
EXTENDS     Naturals

VARIABLE    clock,
            record

RecordType == [key : BOOLEAN]

NoRecord == CHOOSE r : r \notin RecordType

TypeInvariant ==
    /\ clock \in Nat
    /\ record \in RecordType \cup {NoRecord}

Tick ==
    /\ \/ record = NoRecord
        \/ record.key
    /\ clock' = clock + 1
    /\ UNCHANGED <<record>>

CreateRecord ==
    /\ record = NoRecord
    /\ record' = [key |-> FALSE]
    /\ UNCHANGED <<clock>>

SetRecordFlag(value) ==
    /\ record /= NoRecord
    /\ record' = [record EXCEPT !.key = value]
    /\ UNCHANGED <<clock>>

DestroyRecord ==
    /\ record /= NoRecord
    /\ record' = NoRecord
    /\ UNCHANGED <<clock>>

Init ==
    /\ clock = 0
    /\ record = NoRecord

Next ==
    \/ Tick
    \/ CreateRecord
    \/ \E t \in BOOLEAN : SetRecordFlag(t)
    \/ DestroyRecord
=============================================================================

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+unsubscribe@googlegroups.com.
To post to this group, send email to tla...@xxxxxxxxxxxxxxxx.
Visit this group at https://groups.google.com/group/tlaplus.
For more options, visit https://groups.google.com/d/optout.