Re: [tlaplus] Modelling network partitions explicitly

Hi Stephan,

In fact all have that counter in reality, I use it to limit the state space. I just wanted to simplify it for this post so removed them but I missed that one you have pointed out.

Jack

On Tuesday, January 7, 2020 at 1:53:58 PM UTC+1, Stephan Merz wrote:

Hello,

your definitions look good to me. Obviously, modeling failures increases the number of reachable states with respect to a model that just assumes that everything works fine all of the time. The action RestoreLink (but curiously not RestoreAllLinks) introduces a counter that remembers how often a link has been restored – this will in general lead to an infinite state space and require a state constraint for model checking, and I'd consider to what extent that information is really relevant to understanding the system you are interested in and its properties.

Best,
Stephan

On 7 Jan 2020, at 12:49, jvanl...@xxxxxxxxxx wrote:

Hi,

I have written a spec for an Erlang based system. Erlang OTP has a feature called a monitor. It allows one process to be notified when another process has terminated or become unavailable and the system makes use of these monitors in multiple places.

I have not seen any TLA+ specifications model network partitions directly but I felt it was important for ensuring the use of monitors work as designed.

The way I have modelled it is with a function called link where N is the set of all nodes.

link \in [N -> [N -> {Up, Down}]]

This way I can model bidirectional and unidirectional partitions of any combination. For now I am modelling the loss and restoration of links between two specific nodes or all links to a specific node.

ReachableNodes(n) == { m \in N : link[n][m] = Up /\ machine[m].status # Offline } SeesMajority(n) == Cardinality(ReachableNodes(n)) >= (Cardinality(N) \div 2) + 1 LinkOk(n, m) == \/ n = m \/ /\ n # m /\ link[n][m] = Up LinkDown(n, m) == /\ n # m /\ link[n][m] = Down LoseLink(n, m) == /\ n # m /\ LinkOk(n, m) /\ link' = [nd \in N |-> CASE nd = n -> [link[nd] EXCEPT ![m] = Down] [] nd = m -> [link[nd] EXCEPT ![n] = Down] [] OTHER -> link[nd]] RestoreLink(n, m) == /\ n # m /\ LinkDown(n, m) /\ link' = [nd \in N |-> CASE nd = n -> [link[nd] EXCEPT ![m] = Up] [] nd = m -> [link[nd] EXCEPT ![n] = Up] [] OTHER -> link[nd]] /\ link_change_ctr' = link_change_ctr + 1 LoseAllLinks(n) == /\ \A m \in N : LinkOk(n, m) /\ link' = [nd \in N |-> IF nd = n THEN [m \in N |-> IF nd = m THEN Up ELSE Down] ELSE [link[nd] EXCEPT ![n] = Down]] RestoreAllLinks(n) == /\ \E m \in N : LinkDown(n, m) /\ link' = [nd \in N |-> IF nd = n THEN [m \in N |-> Up] ELSE [link[nd] EXCEPT ![n] = Up]]

Then I have some operators for determining if different actors in the system (Raft state machines, lock processes and transactions) are visible from a given node.

MachineVisibleFromNode(n, from_node) == /\ machine[n].status # Offline /\ LinkOk(n, from_node) LockerVisibleFromNode(l, from_node) == /\ locker[l].status \notin { NotStarted, Stopped } /\ LinkOk(locker[l].node, from_node) TransactionVisibleFromNode(t, from_node) == /\ transaction[t].status \notin { NotStarted, Stopped } /\ LinkOk(transaction[t].node, from_node)

My question is this: If I am not seeing examples where people model network partitions directly, is that something that is generally not required? Are there other ways of modelling behaviours without explicitly modelling visibility between different actors? I ask this as modelling network partitions has greatly increased the state space and if there are other ways then I'd love to hear about them.

Thanks
Jack

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tla...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/7ae0d862-a627-4171-8d3a-4b81d4658221%40googlegroups.com.