[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [tlaplus] Re: TLC: NoSuchElementException on checkpoint restore



вторник, 4 апреля 2017 г., 21:14:27 UTC+3 пользователь Markus Alexander Kuppe написал:
> On 04.04.2017 17:09, evgeniy....@xxxxxxxxx wrote:
> > After recovering from checkpoint I have a very weird messages displayed periodically:
> > tlc RAPlus.tla -config RAPlus.cfg -deadlock -metadir /run/media/MC -checkpoint 30 -recover /run/media/MC/17-04-03-13-54-53
> > TLC2 Version 2.09 of 28 January 2016
> > Running in Model-Checking mode with 3 workers.
> > Parsing file RAPlus.tla
> > Parsing file /home/sh/tla2tools/tla/tla2sany/StandardModules/Naturals.tla
> > Parsing file /home/sh/tla2tools/tla/tla2sany/StandardModules/Sequences.tla
> > Parsing file /home/sh/tla2tools/tla/tla2sany/StandardModules/TLC.tla
> > Parsing file /home/sh/tla2tools/tla/tla2sany/StandardModules/FiniteSets.tla
> > Semantic processing of module Naturals
> > Semantic processing of module Sequences
> > Semantic processing of module TLC
> > Semantic processing of module FiniteSets
> > Semantic processing of module RAPlus
> > Starting... (2017-04-04 18:04:14)
> > Implied-temporal checking--satisfiability problem has 21 branches.
> > Starting recovery from checkpoint /run/media/sh/MC/17-04-03-13-54-53/
> > AAAAAA
> > AAAAAA
> > AAAAAA
> > AAAAAA
> > 
> > and so on.
> > 
> > My model has nothing to do with that "AAAAAA" messages. I have not saw such messages before.
> >
> > [...]
> >
> > Some time later I got this message:
> > 
> > Error: Java ran out of memory.  Running Java with a larger memory allocation
> > pool (heap) may fix this.  But it won't help if some state has an enormous
> > number of successor states, or if TLC must compute the value of a huge set.
> > Finished in 04min 10s at (2017-04-04 18:08:25)
> > 
> > This is strange because I ran TLC with the same parameters for virtual machine memory 
> > as before. 
> 
> 
> Hi Evgeniy,
> 
> the "AAAAAA" messages are obviously bogus but anyway harmless. My guess
> is, that the corresponding print statement in the code is a leftover
> from a debugging session.
> The reason why TLC shows several messages is, because your liveness
> properties have been split into 21 branches and TLC recovers a graph for
> each one them.
> 
> With regards to the out of memory error, are you saying that you only
> get the error if you recover from a checkpoint but don't get it, if you
> run the exact same model (same parameters, safety & liveness properties,
> ...) without recovery? If yes, can you make your checkpoint data and
> spec & model (privately) available to me?
> 
> By the way, when did you download the Toolbox? The version TLC reports
> indicates that you are running a nightly/CI build.
> 
> Cheers
> Markus

The whole story is as follows:
I downloaded TLA+ tools from some official web site (maybe Lamports website, I dont remember exactly)
I ran TLC on my model which produced aforementioned checkpoint.

Some time later I decided to stop TLC (ctrl+c). I did that after checkpoint was finished so there is no reason to suspect that checkpoint is broken.

I ran TLC again with that checkpoint (- but got an error which is a subject of this topic (NoSuchElementException)

Then I saw that message from mr.Giuliano, where he suggest to try latest build.
I downloaded that latest build and ran it again. I ran it with the same parameters as before except checkpoint interval parameter (It was increased this time). 
I used this link for download: http://tla.msr-inria.inria.fr/tlatoolbox/ci/tladist/

My original run command was as follows:
tlc RAPlus.tla -config RAPlus.cfg -deadlock -metadir /run/media/MC -checkpoint 5

My recover attempt was:
tlc RAPlus.tla -config RAPlus.cfg -deadlock -metadir /run/media/MC -checkpoint 30 -recover /run/media/MC/17-04-03-13-54-53

tlc is a script which is equal to:
#!/bin/bash
java -Djava.net.preferIPv4Stack=true \
     -Dcom.ibm.cacheLocalHost=true \
     -Djava.net.preferIPv6Addresses=false \
     -Xmx16g \
     tlc2.TLC -cleanup -workers 3 $*

> With regards to the out of memory error, are you saying that you only
> get the error if you recover from a checkpoint but don't get it, if you
> run the exact same model (same parameters, safety & liveness properties,
> ...) without recovery? 

I get this error on recovery using that latest version of TLC. I actually get memory error also when running latest version of TLC from scratch (after several hours of checking).
But I do not get that error when using previous version of TLC from scratch on that model.