Re: Implementation checking
I also find this problem really interesting. There's always a gap between the actual implementation and the abstract model. Some static analysis tools try to extract a model from the code and use model checking to verify the implementation. Another approach is to use dynamic analysis techniques, as you discussed. If we treat the runnable program as a more concrete implementation of an abstract model, we can verify the program with an approach similar to refinement in TLA+:
Map the state variables in program P to the variables in model M and track the state transitions. Each transition in program P should then be a valid action in model M; that is, Spec_P => Spec_M.
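The mapping above can be sketched in a few lines. This is a minimal illustration, not a real TLC feature: the model here is a toy counter, and all names (`abstract`, `next_ok`, `items_processed`) are hypothetical.

```python
# Sketch: check that observed program transitions refine a model's
# next-state relation. The "model" is a toy counter modulo 10.

def abstract(program_state):
    """The refinement mapping: concrete program state -> model variables."""
    return {"count": program_state["items_processed"] % 10}

def next_ok(s, t):
    """Model action Increment: count' = (count + 1) % 10, or a stuttering step."""
    return t["count"] == (s["count"] + 1) % 10 or t == s

def check_trace(program_states):
    """True iff every observed step abstracts to a valid model action."""
    abstract_states = [abstract(s) for s in program_states]
    for s, t in zip(abstract_states, abstract_states[1:]):
        if not next_ok(s, t):
            return False
    return True

trace = [{"items_processed": n} for n in range(12)]
print(check_trace(trace))  # True: every step is an Increment action
```

Of course, this only checks a single observed trace against the action formula; it says nothing about traces the program did not happen to produce, which is exactly the limitation of dynamic approaches.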
However, there are some obvious problems:
1. How do we track the state changes in a program?
2. A real program has a vastly larger state space, due to the added detail and finer level of atomicity. It's not feasible to enumerate all the states.
3. A more realistic approach might be to monitor the system to prevent it from reaching certain unexpected states.
I think verifying a program against its specification is really attractive, but it still requires more effort to become practical.
On Tuesday, May 15, 2018 at 12:40:10 AM UTC-4, cola...@xxxxxxxxx wrote:
Hi all, it occurred to me recently that it would be useful to verify that the behavior of an actual running system matches a TLA+ specification. The way I see this working is:
1. The system under test is augmented to emit a set of TLA+ variable assignments (states) matching its internal state when it first starts, and thereafter each time the state changes. For ease of use these could be formatted as JSON objects, which match the TLA+ data model quite well, and for which libraries exist in all common languages.
2. These variable assignment sets are to be streamed (via a socket) into an instance of TLC running in a new mode, call it "implementation checking mode" (perhaps there is a better name). A command-line flag could indicate whether partial assignments are allowed to be present on the stream (which should be interpreted as deltas from the previous state).
3. In this mode, TLC processes the model provided on the command line as usual with the following exceptions:
a) All CONSTRAINT, VIEW, SYMMETRY, INVARIANT, and PROPERTY commands are ignored.
b) What would become Init, Next, and Temporal formulae are used instead as ImpliedInit, ImpliedAction, Invariant, and ImpliedTemporal.
c) Variables are initialized to the first variable assignment set streamed to TLC, as if by an Init formula. Each successive assignment that is streamed to TLC steps the system to a new state, as if by a Next formula.
4. Model checking is performed mostly as usual, but with several important differences:
a) No state graph is maintained. (This has several consequences noted below.) However for purposes of error reporting a limited history could be maintained. (This is necessary to ensure that this feature can be used against a system left running indefinitely.)
b) ImpliedInit, ImpliedAction, and Invariant are checked in the obvious manner. If a violation is found, the error is reported both on the console (including at least the values of all variables in the previous and current states) and on the socket (simply as a notification to the running system that it may wish to cease operation). TLC will then stop processing, but should leave the socket open until the client closes it.
c) At each state, subformulae of ImpliedTemporal are satisfied or falsified as appropriate, altering the formula for future steps. (The exact rules are not complex but are too detailed for this summary.) This may of course cause the entire ImpliedTemporal formula to be falsified; if so, a violation is immediately reported as for an Invariant.
5. When the stream is closed by the system under test, TLC exits and reports success (along with some fun statistics, including model coverage if requested). However, the system under test may emit, as its final "state", an indication that the final (previous) state should be considered to stutter indefinitely. (If JSON is used, the JSON null value should suffice.) In this case, TLC will check a graph consisting of exactly the final state and a self-transition against the (altered) ImpliedTemporal formula, reporting a violation as above.
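The wire protocol in steps 1, 2, and 5 can be sketched as follows. This is only an illustration of the proposed (not existing) behavior: one JSON object per line, a full initial state followed by partial assignments interpreted as deltas, and JSON null as the "stutter indefinitely" marker.

```python
import json

def replay(lines):
    """Yield full states reconstructed from a stream of JSON state deltas.

    The first object must be a complete assignment; later objects may be
    partial and are merged into the previous state. A JSON null ends the
    stream, marking the last state as stuttering indefinitely.
    """
    state = None
    for line in lines:
        delta = json.loads(line)
        if delta is None:  # final "state": previous state stutters forever
            return
        state = delta if state is None else {**state, **delta}
        yield dict(state)

stream = ['{"x": 0, "y": 0}', '{"x": 1}', '{"y": 2}', 'null']
print(list(replay(stream)))
# [{'x': 0, 'y': 0}, {'x': 1, 'y': 0}, {'x': 1, 'y': 2}]
```

On the TLC side, each yielded state would be checked against ImpliedInit (first state) or ImpliedAction and Invariant (subsequent states), with no state graph kept.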
I believe that such a feature could help close the gap between verified models and verified code, by essentially allowing a verified model to be used as assertions in a testing scenario. Certainly, I see cases at my day job where such a feature could be put to immediate use.
(Aside -- a related feature which may be useful is the ability to ingest the output of a TLC error trace as a series of states. This could be used for quickly testing that a fix to a model eliminates an error trace as a possible history, before committing to run the entire model again (if even from a checkpoint).)
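For the aside, a very rough sketch of trace ingestion, under strong simplifying assumptions: it assumes each state in the trace is printed as a "State N: ..." header followed by "/\ var = value" conjuncts, and handles only integer values. Real TLC traces contain arbitrary TLA+ values (sets, functions, records), which would need a proper parser.

```python
import re

def parse_trace(text):
    """Parse a TLC-style error trace into a list of state dictionaries.

    Simplification: only integer-valued variables are recognized; anything
    else is silently skipped.
    """
    states, current = [], None
    for line in text.splitlines():
        if line.startswith("State "):        # start of a new state
            current = {}
            states.append(current)
        else:
            m = re.match(r"/\\ (\w+) = (-?\d+)", line.strip())
            if m and current is not None:
                current[m.group(1)] = int(m.group(2))
    return states

trace = """State 1: <Initial predicate>
/\\ x = 0

State 2: <Action of module M>
/\\ x = 1
"""
print(parse_trace(trace))  # [{'x': 0}, {'x': 1}]
```

The resulting list of states could then be fed straight into the proposed implementation-checking mode to confirm that a fixed model no longer admits the trace.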
Do others see value in this proposed feature? Does the proposed design seem sound? If so I'll probably work on developing it in my spare time, as it certainly will be useful to me. No promises when it might be complete though ;)