[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [tlaplus] Liveness only when a certain condition holds



Hi Calvin,

First of all, thanks for writing such a detailed response.

Your abstract specification is very close to what I'm having to deal with here. Although I did consider using refinement, it wouldn't be as abstract as yours; I'd probably just reuse my correctness property there as well. Writing an abstract specification as the property itself through refinement is new to me, so this is a welcome suggestion (plus, a whole new approach to use with other specifications too!).

I was avoiding using refinement, but I'll give it a shot.

(I'm still curious about my broken property, though, in case anyone has any ideas... Hahah)

Best,
Jones

On Wednesday, 10 May 2023 at 18:39:44 UTC-3 Calvin Loncaric wrote:
Hello,

Instead of addressing your direct question ("what's wrong with my property?"), I want to suggest a different (maybe better) way to define and check the property you're after.

Temporal formulas can be very surprising. Personally, I find them extremely difficult to read and write and reason about. I view anything more complicated than []<>P or P~>Q with suspicion and fear.

Fortunately, TLA+ offers another way to specify and check behaviors: "refinement". Chapters 4 and 5 of Specifying Systems covers this in depth, but if I were trying to formalize the property you care about, I would start with an entirely new, very simple specification, inspired by your own description of the property:

> an election only works if no node fails during it. [and it must eventually succeed if no node fails during it]

In other words, you can describe your system as a state machine where
  • An election can start
  • While an election is in progress, either
    • A participant can fail, or
    • The election eventually succeeds.
"Elections.tla" (attached) is a very abstract description of your system that just captures the relevant actions and liveness properties. I also added some bonus actions like "nodes can recover" and "elections can abort", which may or may not be relevant to you.

For instance:
  • "ElectionSucceeds" says that if an election is in progress and enough nodes are active, the election can succeed.
  • "Spec" says that "ElectionSucceeds" is weakly fair---i.e. if enough nodes remain active for a long time then eventually the system makes progress and the election succeeds.
The "Elections" spec is not a model of your system; it's aspirational. But it is useful because you use it to formally state the property you care about:

"My proposed implementation beaves according to the aspirational behavior in 'Elections.tla'."

You can state that property with a few lines of "boilerplate" TLA+ that connect the two specs:

Abstract == INSTANCE Elections WITH
    election_in_progress <- ThereIsAnElection,
    enough_nodes_are_active <- Cardinality({n \in NodeSet: isActive[n]}) * 2 > Cardinality(NodeSet),
    election_successful <- ElectionWorks

ElectionCorrectness == Abstract!Spec

... or something along those lines. "ElectionCorrectness" is a property you can ask TLC to check.

Obviously this is a LOT less compact than a 5-line temporal formula! Even so, I strongly endorse defining correctness this way if you can. I find that the added verbosity helps make expectations about the system extremely precise and flexible.


--
Calvin


On Wed, May 10, 2023 at 12:47 PM Jones Martins <jone...@xxxxxxxxx> wrote:
Just to illustrate what is happening, I'm sharing 8 states out of the 17-state trace I got after verifying whether Spec satisfies Property (this time, with three nodes instead of two).

This trace is slightly edited for clarity, highlighting changed variables in green.

As a reminder, this is Property:

Property ==
  [](
    (/\ ThereIsAnElection
     /\ []EveryoneStaysActiveDuringElection) =>
       <>ElectionWorks
  )

According to the second condition in Property, state 8 shouldn't happen, because no element in 02_inElectionImpliesActive should be FALSE.

Alias 02_inElectionImpliesActive is defined as [ node \in NodeSet |-> InElection(node) => isActive[node] ],
whereas EveryoneStaysActiveDuringElection is defined as \A node \in NodeSet: InElection(node) => isActive[node].

1: <Initial predicate>
/\ 00_inElection = <<FALSE, FALSE, FALSE>>
/\ 01_someNodeInElection = FALSE
/\ 02_inElectionImpliesActive = <<TRUE, TRUE, TRUE>>
2: Fail
/\ 00_inElection = <<FALSE, FALSE, FALSE>>
/\ 01_someNodeInElection = FALSE
/\ 02_inElectionImpliesActive = <<TRUE, TRUE, TRUE>>
3: CheckCoordinator
/\ 00_inElection = <<FALSE, FALSE, FALSE>>
/\ 01_someNodeInElection = FALSE
/\ 02_inElectionImpliesActive = <<TRUE, TRUE, TRUE>>
4: Election0
/\ 00_inElection = <<FALSE, TRUE, FALSE>>
/\ 01_someNodeInElection = TRUE
/\ 02_inElectionImpliesActive = <<TRUE, TRUE, TRUE>>
5: Election1
/\ 00_inElection = <<FALSE, TRUE, FALSE>>
/\ 01_someNodeInElection = TRUE
/\ 02_inElectionImpliesActive = <<TRUE, TRUE, TRUE>>
6: Recover
/\ 00_inElection = <<FALSE, TRUE, FALSE>>
/\ 01_someNodeInElection = TRUE
/\ 02_inElectionImpliesActive = <<TRUE, TRUE, TRUE>>
7: Election2
/\ 00_inElection = <<TRUE, TRUE, FALSE>>
/\ 01_someNodeInElection = TRUE
/\ 02_inElectionImpliesActive = <<TRUE, TRUE, TRUE>>
8: Fail
/\ 00_inElection = <<TRUE, TRUE, FALSE>>
/\ 01_someNodeInElection = TRUE
/\ 02_inElectionImpliesActive = <<FALSE, TRUE, TRUE>>
...

Interestingly, checking Property2 == [](ThereIsAnElection => <>ElectionWorks) against

Next2 == Next /\ EveryoneStaysActiveDuringElection'
Spec2 == Init /\ [][Next2]_vars /\ Fairness

works just fine. (Thanks again, Hillel!)

So there's a problem in my definition of Property that I cannot find.


Best regards, and thanks again, everyone,

Jones

On Wednesday, 10 May 2023 at 16:11:15 UTC-3 Jones Martins wrote:
Hi everyone,

I'm back to square one after increasing the number of nodes and checking whether Spec satisfies Property.

TLC keeps showing state transitions that shouldn't happen, but checking whether Spec2 satisfies Property2 still works, so that has been my current approach.

Best,
Jones


On Tuesday, 9 May 2023 at 18:01:00 UTC-3 Jones Martins wrote:
Hi Hillel,

Your suggestion worked! That's the power of action constraints... The state space was different, though, and I'm not sure that's a problem.

Before: 
  Diameter: 22
  States found: 791
  Distinct: 355

After:
  Diameter: 22
  States found: 704
  Distinct: 316

Jones


On Tue, 9 May 2023 at 17:19, Hillel Wayne <hwa...@xxxxxxxxx> wrote:

I'd suggest a slightly different way of doing the same thing:


Next2 == Next /\ EveryoneStaysActiveDuringElection'

Spec2 == Init /\ [][Next2]_vars /\ Fairness


As for the initial problem, I think it's because EveryoneStaysActiveDuringElection is true when there isn't an election or if no nodes are part of the election, which the new version doesn't have. Maybe that's part of it?

H

On 5/9/2023 2:44 PM, Jones Martins wrote:
Hi again,

I found a fix, but I'm not very convinced...

Instead of verifying

Property ==
  [](
    (/\ ThereIsAnElection
     /\ EveryoneStaysActiveDuringElection) =>
       <>ElectionWorks
  )

I created a new formula Next2, where there's a modified version of my Fail(node) action (not contained in the first post): Fail2(node) == ~InElection(node) /\ Fail(node).

Then I modified Property into Property2 and verified that Spec == Init /\ [][Next2]_vars /\ Fairness satisfies Property2:

Property2 ==
  [](ThereIsAnElection => <>ElectionWorks)

And it worked!

Interestingly, I went to verify that Spec satisfies Property, and it also worked?! Color me surprised!

The only thing I changed in the actual specification was in the Recover(node) action (also not contained in the first post). I realized stuttering was happening where it shouldn't when verifying Property2. That's not the problem I was having before, and I can't see why this modification would fix that.

Conclusion:
 - the spec was incorrect;
 - I still don't know why TLC returned the counterexamples that it did before;
 - creating an alternative spec to verify certain properties is quite a good approach;

Jones

On Tuesday, 9 May 2023 at 15:12:12 UTC-3 Jones Martins wrote:
Hi Andrew and Stephan,


Andrew, yes, the same thing happens, unfortunately.

Stephan, I did try that before, but I forgot to add the second [], for some reason, and it still doesn't "filter" behaviors properly, although that's where I believe the problem is.

I also tried changing the formula's formatting by writing it in one line, adding parentheses, but that's not the issue either.

I don't doubt there's an error in my specification, but I expected that, if we define some property as []( (A /\ []B) => <>C ), TLC would ignore behaviors where either A is false or []B is false at some state before checking that <>C.


Jones
--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@xxxxxxxxxxxxxxxx.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/55724505-e203-4176-a2c8-72a1d95ef0f3n%40googlegroups.com.

--
You received this message because you are subscribed to a topic in the Google Groups "tlaplus" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/tlaplus/EO9ff0OBqtA/unsubscribe.
To unsubscribe from this group and all its topics, send an email to tlaplus+u...@xxxxxxxxxxxxxxxx.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/d49bbe36-1f70-545f-b251-7757e342f07c%40gmail.com.

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@xxxxxxxxxxxxxxxx.

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+unsubscribe@xxxxxxxxxxxxxxxx.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/637d0e31-eced-44a3-a6f4-fdf3cb250fa6n%40googlegroups.com.