[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[tlaplus] Modelling network partitions explicitly


I have written a spec for an Erlang based system. Erlang OTP has a feature called a monitor. It allows one process to be notified when another process has terminated or become unavailable and the system makes use of these monitors in multiple places.

I have not seen any TLA+ specifications model network partitions directly but I felt it was important for ensuring the use of monitors work as designed.

The way I have modelled it is with a function called link where N is the set of all nodes.

link \in [N -> [N -> {Up, Down}]]

This way I can model bidirectional and unidirectional partitions of any combination. For now I am modelling the loss and restoration of links between two specific nodes or all links to a specific node.

ReachableNodes(n) ==
{ m \in N : link[n][m] = Up /\ machine[m].status # Offline }

SeesMajority(n) ==
Cardinality(ReachableNodes(n)) >= (Cardinality(N) \div 2) + 1

LinkOk(n, m) ==
\/ n = m
\/ /\ n # m
/\ link[n][m] = Up
LinkDown(n, m) ==
/\ n # m
\ link[n][m] = Down

LoseLink(n, m) ==
/\ n # m
\ LinkOk(n, m)
/\ link' = [nd \in N |->
                    CASE nd = n -> [link[nd] EXCEPT ![m] = Down]
                    []   nd = m -> [link[nd] EXCEPT ![n] = Down]
                    [] OTHER -> link[nd]]

RestoreLink(n, m) ==
    /\ n # m
    /\ LinkDown(n, m)
    /\ link'
= [nd \in N |->
                    CASE nd
= n -> [link[nd] EXCEPT ![m] = Up]
[]   nd = m -> [link[nd] EXCEPT ![n] = Up]
[] OTHER -> link[nd]]
/\ link_change_ctr' = link_change_ctr + 1

LoseAllLinks(n) ==
    /\ \A m \in N : LinkOk(n, m)
    /\ link'
= [nd \in N |->
                    IF nd
= n THEN [m \in N |-> IF nd = m THEN Up ELSE Down]
[link[nd] EXCEPT ![n] = Down]]

RestoreAllLinks(n) ==
/\ \E m \in N : LinkDown(n, m)
\ link' = [nd \in N |->
                    IF nd = n THEN [m \in N |-> Up]
                    ELSE [link[nd] EXCEPT ![n] = Up]]

Then I have some operators for determining if different actors in the system (Raft state machines, lock processes and transactions) are visible from a given node.

MachineVisibleFromNode(n, from_node) ==
/\ machine[n].status # Offline
\ LinkOk(n, from_node)
LockerVisibleFromNode(l, from_node) ==
/\ locker[l].status \notin { NotStarted, Stopped }
\ LinkOk(locker[l].node, from_node)
TransactionVisibleFromNode(t, from_node) ==
/\ transaction[t].status \notin { NotStarted, Stopped }
\ LinkOk(transaction[t].node, from_node)

My question is this: If I am not seeing examples where people model network partitions directly, is that something that is generally not required? Are there other ways of modelling behaviours without explicitly modelling visibility between different actors? I ask this as modelling network partitions has greatly increased the state space and if there are other ways then I'd love to hear about them.


You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+unsubscribe@xxxxxxxxxxxxxxxx.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/7ae0d862-a627-4171-8d3a-4b81d4658221%40googlegroups.com.