
Re: [tlaplus] How to get rid of the negative influence of a cycle timer mechanism



If I understand correctly, you are not interested in the value of Tick other than to ensure that sync messages are not sent "too often". However, action TimerProc(n) remains enabled, so the timer can simply tick MAX_TIME times before a sync message is resent. Therefore, it seems to me that this just blows up your state space for no good purpose: roughly speaking, every reachable state gets duplicated for each of the up to MAX_TIME + 1 possible values of Tick.

I'd rather remove the Tick variable and write

TimerProc(n) ==
  /\ NodeState[n] = "Work"
  /\ BroadcastSyncMessage(n)  \* assume that the timer has elapsed
  /\ UNCHANGED vars           \* vars: the tuple of variables that BroadcastSyncMessage leaves unchanged

Of course, this means that several sync messages can be sent in a row, without any other behavior taking place in between. (But remember that TLC explores all possible executions, so more "reasonable" behaviors will be checked as well.) Informally, immediately resending sync messages corresponds to the system being extremely slow. The benefit of modeling timeouts non-deterministically in this way is that you get rid of the Tick variable and the associated blowup of the state space. If messages are transmitted through queues, you'll need to bound queue sizes anyway. If the network is modeled using sets (i.e., unordered messages without duplicates), resending a sync message will just lead to the same state being revisited and will have essentially zero cost.
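
To make the last point concrete, here is a minimal sketch of what a set-based network could look like; the variable msgs and the message format are illustrative assumptions, not taken from your spec:

BroadcastSyncMessage(n) ==
    \* msgs is a set of records; adding a message that is already in
    \* flight leaves msgs unchanged, so TLC just revisits a known state.
    msgs' = msgs \cup {[type |-> "sync", from |-> n]}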

Hope this makes sense,
Stephan

On 15 Jun 2020, at 11:32, Earnshaw <dongluming77@xxxxxxxxx> wrote:

OK, I have seen the previous discussion about timeouts. In that case, the timeout is used to protect the reliability of message transmission, e.g. once a packet is dropped, the receiver enables the timeout behavior, so you can specify the timeout action as a successor of the packet-loss action in the state machine.

But in my case, the timeout happens periodically: it is used to trigger a state-sync operation across the whole cluster, so it is always enabled as long as the node is not faulty.
The following is my current description of the cyclic timer function for a given node.
 
              ...
 
TimerProc(n) ==
    /\ NodeState[n] = "Work"
    /\ IF Tick = MAX_TIME
       THEN /\ Tick' = 0
            /\ BroadcastSyncMessage(n)  \* when the cyclic timer expires, the node broadcasts a sync message to the whole cluster and resets the timer
       ELSE Tick' = Tick + 1            \* otherwise the tick count just advances
    /\ UNCHANGED vars                   \* vars must exclude Tick and whatever BroadcastSyncMessage changes


NextForNode(n) ==
    \/ TimerProc(n)
    \/ ...              \* other service behavior

Is that OK? I am not so content with the formula above, because I don't need to model the real-time behavior of the system, so I would like to know whether the Tick variable could be eliminated.

I am also afraid that, since TimerProc() is always enabled, it will lead to a state-space explosion of the whole system.





On Monday, June 15, 2020 at 2:25:05 PM UTC+8, Stephan Merz wrote:
I am not really sure that I understand your question. Of course, TLA+ is meant to model systems whose behavior "never ends", and as long as the state space is finite, TLC can verify properties of such systems.
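
As a concrete illustration, the hour clock from Lamport's "Specifying Systems" describes a behavior that never terminates, yet it has only twelve reachable states, so TLC checks it instantly:

VARIABLE hr
HCini == hr \in (1 .. 12)
HCnxt == hr' = IF hr = 12 THEN 1 ELSE hr + 1
HC    == HCini /\ [][HCnxt]_hr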

There have been previous discussions on modeling timers and timeouts in this Google group, e.g. [1]. Perhaps you find some hints there, otherwise feel free to come back, perhaps with a simplified example that explains the issue.

Thanks,
Stephan

[1] https://groups.google.com/forum/#!topic/tlaplus/mvrg99_xc74

On 15 Jun 2020, at 05:46, Earnshaw <donglu...@xxxxxxxxx> wrote:

Dear all:

   I am specifying a distributed system consisting of a cluster of nodes.

   Each node's service state is synchronized to the whole cluster periodically.

   In the implementation layer, each node has a cyclic timer that generates a synchronization message periodically.


   What really confuses me is how to specify this mechanism abstractly.

   It is easy to simulate the timeout method under the precondition that the node is in the working state, but the result is that the state machine is always driven by the timeout mechanism and never ends.

   This scenario is somewhat like the so-called stuttering state: the system has converged and there is no other valid trigger, yet the timer keeps firing.

   So, what is the principle for getting rid of the cyclic timer's stuttering influence when building the model?


    Best Regards

   
  
