[tlaplus] Re: Question: Given a system where an input can be any int, how can you describe it with TLA+ in a way that TLC can check?

On Sunday, December 13, 2020 at 12:29:03 PM UTC-5 Leslie Lamport wrote:

Hi Ari,

That's a good question, and it deserves a good answer. Let's see what
I can do.

Why do we have an infinite set of natural numbers? Before there were
computers, no one was ever going to write a number with more that a
few hundred digits. With a computer, no one is ever going to write a
number with more than a few billion digits. Why don't we just declare
that there is some largest natural number so we don't have to deal
with an infinite set?

The answer is: for simplicity. The laws of arithmetic would be much
more complicated with a finite number of integers. For example,
3*n - 2*n = n would be false for must integers n.

Similarly, why did you want to write a spec of a system in which a
variable x can equal any natural number? A system that implements the
spec will break in an execution described by a behavior of the spec in
which the value of x gets too large. The answer is the same: for
simplicity.

A specification is an abstract model of a system. As someone almost
said:

No model is accurate; some models are useful.

You wrote a spec in which x can get arbitrarily large because either
(i) you believe that in practice the value of x will never get large
enough to worry about, or (ii) what to do if x gets too large is a
detail that is irrelevant to the purpose for which you wrote the
spec. In case (ii), the implementation will have to deal with the
value of x getting too big, but you didn't want to worry about that
in your spec. You wrote the spec to model some other aspect of the
system.

As Stephan wrote, math has no trouble dealing with an infinite set of
integers, and using them will make the proof simpler. But TLC can't
deal with infinite sets. With most model checkers, you would have to
rewrite the spec to check it, making it allow only a finite set of
possible states. The goal of TLC is to be able to check any spec
without modifying it. You do that by letting TLC check a finite model
of the spec. TLC models provide two features for doing that. First,
you can tell it to replace the set Nat by a finite set of numbers.
Second, you can tell the model checker to ignore states in which the
value of x is greater than some value.

Checking a model of the spec doesn't prove that the spec is correct.
But correctness of the spec doesn't prove that the system is correct.
Remember, that a spec is an abstract model. It's not supposed to be
accurate; it's supposed to be useful. Its use is to eliminate some
class of errors--in particular, what one would call design errors
rather than coding errors. Checking a model of the spec is useful if
it can catch errors in the spec.

In practice, checking a large enough model is a good way of finding
errors. In principle, a violation of a safety property can be always
be found by checking a large enough model. That's not true of
liveness properties. In practice, a finite model most often causes
TLC to find a liveness error that doesn't exist in the spec. Ensuring
that something eventually happens in all executions can require
executions in which the value of x becomes arbitrarily large. It's
also possible that a finite model hides the fact that a liveness
property isn't satisfied, but that seems to happen less often.

What constitutes a "large enough" model? The larger the model, the
more chance that it will catch all errors in the spec and the more
confidence it provides. In practice, it's a matter of engineering
judgement whether the models that TLC is able to check give you
sufficient confidence in the correctness of the spec. This seems to
work quite well for catching safety errors. I don't have enough data
to know how well it works in industry for catching liveness errors.
My own personal experience has been that safety errors tend to be
more subtle and easier to make than liveness errors.

Leslie

On Sunday, December 13, 2020 at 12:45:55 AM UTC-8 Ari Sweedler wrote:
I first asked this question on Reddit, https://www.reddit.com/r/tlaplus/comments/kc75n0/given_a_system_where_an_input_can_be_any_int_how/, but figured this was a better place to ask. I'll copy/paste my post & question here to save y'all from having to click that link:

Given a system where an input can be any int, how can you describe it with TLA+ in a way that TLC can check?

Btw, extreme noob here. Started Hillel Wayne's book. Saw this https://www.learntla.com/introduction/example/

If we actually wanted to test all possible cases, we could replace money \in 1..20 with money \in Nat, where Nat is the set of natural numbers. This is perfectly valid TLA+. Unfortunately, it’s also something the model checker can’t handle. TLC can only check a subset of TLA+, and infinite sets aren’t part of that.

And this TLA+ started as a notation, and the model checker, TLC, only came out 5 years later. As it was never intended to be run, there’s some assumptions of it being a read document instead of being runnable code.

So I kinda get it, the mathematical proof of "All natural numbers" isn't computable. However, I am first a computer scientist and second,,, well not even second. And thirdly a (wannabe) mathematician. So the fact that Nat is not a finite number. To put it in the language of mathematics, this is the finite set of the values of all valid ints: GF(1)32. (On a 32-bit system, that is).

So should I write something like \in 0..4294967296 or \in -2147483648..2147483647 everywhere in my code? That's not reasonable. Unless TLA+ does something fancy with efficiency (something inductive, to prove that the remaining values follow from prior cases), then I am wrong to want to test all integers.

But then... what's the point of a spec that isn't exhaustive? Can someone help explain?

Thanks,
Ari