[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [tlaplus] Re: Specifying a stopping condition



Am Dienstag, 18. Dezember 2018 19:11:58 UTC+1 schrieb Stephan Merz:
> Hello,
> 
> 
> without seeing the specification, it's not completely obvious to understand what you are trying to achieve and what the counter-example is. I understand that the property you are interested in is <>[]AllPrintJobsFinished, where the predicate is defined as
> 
> 
> AllPrintJobsFinished == \A printjob \in PRINTJOB : printjobState[printjob] \in {"printed","failed"}
> 
> 
> 
> I imagine that PRINTJOB is a fixed set of job IDs.
> 
> 
> The next-state relation has two main disjuncts
> 
> 
> Next ==
>     \/ Advance
>     \/ Failure
> 
> 
> 
> where action Advance models the regular progression of print jobs (itself subdivided into three steps) and Failure models a worker restarting. However, you do not explain how such a "restart" is modeled: one may imagine that it sets the job the worker is handling to "failed". Can a new job with the same ID as the failed one be started later? In that case, it is easy to picture a counter-example in which a worker accepts a job, fails, accepts a new job (with the same ID), and so on forever. If on the other hand the job remains "failed" forever, then one would assume that eventually no worker can accept a new job and therefore eventually all jobs must either be "failed" or "printed", given your fairness condition.
> 
> 
> If you cannot figure out the problem yourself, feel free to send me the spec by direct email.
> 
> 
> Regards,
> Stephan
> 
> 
> 
> 
> 
> On 17 Dec 2018, at 21:54, nic...@xxxxxxxxx wrote:
> 
> Am Sonntag, 16. Dezember 2018 16:45:31 UTC+1 schrieb nic...@xxxxxxxxx:
> Am Sonntag, 16. Dezember 2018 02:09:31 UTC+1 schrieb Leslie Lamport:
> If, at the end of error trace, it says something like "back to state 17", then the error trace is an infinite trace that ends by infinitely repeating the sequence of states from state number 17 through the last displayed state.
> 
> 
> Leslie 
> 
> 
> On Saturday, December 15, 2018 at 4:36:27 PM UTC-8, nickkell at gmail.com wrote:Hello all!
> 
> 
> 
> I've been working on a spec for a print queue, and ideally it should stop when all the items to be printed are either "printed" or "failed". Multiple "worker" applications are competing to be the next to print for any given printer so have to acquire a lock. I also wanted to model what happens when the workers crash and restart. This is the end of the spec (for brevity):
> 
> 
> 
> RetrieveNextPrintjob == \E worker \in WORKER, printer \in PRINTER : GetNextForPrinting(worker, printer)
> 
> DoPrint == \E worker \in WORKER, printjob \in PRINTJOB : WorkerProcessPrintJob(worker, printjob)
> 
> UpdateStatus == \E worker \in WORKER, printjob \in PRINTJOB : HubReceivedPrintJobMessage(worker, printjob) 
> 
> Advance == 
> 
>     \/ RetrieveNextPrintjob    
> 
>     \/ DoPrint
> 
>     \/ UpdateStatus
> 
> 
> 
> Failure == \E worker \in WORKER : WorkerRestart(worker)
> 
> 
> 
> Next ==
> 
>     \/ Advance
> 
>     \/ Failure    
> 
>     
> 
> PQ == Init /\ [][Next]_vars
> 
> 
> 
> AllPrintJobsFinished == \A printjob \in PRINTJOB : printjobState[printjob] \in {"printed","failed"}
> 
> 
> 
> Spec == PQ /\ SF_vars(Advance)
> 
> 
> 
> Liveness == <>[]AllPrintJobsFinished
> 
> 
> 
> 
> 
> The model checker seems to report that after a WorkerRestart, it goes back to a previous state. Does this mean it will repeatedly flip back and forth without terminating?
> 
> Thanks for your reply. So actions "Advance" and "Failure" will both be enabled, and it can always choose to fail, without advancing towards termination. Should I instead specify that action Advance will always be enabled when AllPrintJobsFinished is false? If so, then how?
> 
> I have a solution, but I don't know if it's correct or just a workaround. I've introduced a counter that's incremented by the WorkerRestart action, which (along with an upper limit as a State Constraint) prevents an infinite cycle of states. 
> Feel free to tell me if it's silly, otherwise I'll remain blissful in my ignorance :)
> 
> -- 
> You received this message because you are subscribed to the Google Groups "tlaplus" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to tlaplu...@xxxxxxxxxxxxxxxx.
> To post to this group, send email to tla...@xxxxxxxxxxxxxxxx.
> Visit this group at https://groups.google.com/group/tlaplus.
> For more options, visit https://groups.google.com/d/optout.

Hi Stephan. You guessed mostly correctly. "Queued" printjobs can be retrieved repeatedly, up until they are marked as "printing". I was trying to model a simple bug whereby the worker initially acquires a lock on a queued printjob, then restarts, but can't retrieve it again because we assert that the printjob must have no lock, rather than no lock OR is already locked by that worker making the request. In this case neither worker could move that printjob beyond queued. 
A variable is used to track which printjobs are being printed by each worker, and is cleared on a restart. This is essentially a backwards step with regards to the termination state