[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [tlaplus] Re: TLC: NoSuchElementException on checkpoint restore
вторник, 4 апреля 2017 г., 21:14:27 UTC+3 пользователь Markus Alexander Kuppe написал:
> On 04.04.2017 17:09, evgeniy....@xxxxxxxxx wrote:
> > After recovering from checkpoint I have a very weird messages displayed periodically:
> > tlc RAPlus.tla -config RAPlus.cfg -deadlock -metadir /run/media/MC -checkpoint 30 -recover /run/media/MC/17-04-03-13-54-53
> > TLC2 Version 2.09 of 28 January 2016
> > Running in Model-Checking mode with 3 workers.
> > Parsing file RAPlus.tla
> > Parsing file /home/sh/tla2tools/tla/tla2sany/StandardModules/Naturals.tla
> > Parsing file /home/sh/tla2tools/tla/tla2sany/StandardModules/Sequences.tla
> > Parsing file /home/sh/tla2tools/tla/tla2sany/StandardModules/TLC.tla
> > Parsing file /home/sh/tla2tools/tla/tla2sany/StandardModules/FiniteSets.tla
> > Semantic processing of module Naturals
> > Semantic processing of module Sequences
> > Semantic processing of module TLC
> > Semantic processing of module FiniteSets
> > Semantic processing of module RAPlus
> > Starting... (2017-04-04 18:04:14)
> > Implied-temporal checking--satisfiability problem has 21 branches.
> > Starting recovery from checkpoint /run/media/sh/MC/17-04-03-13-54-53/
> > AAAAAA
> > AAAAAA
> > AAAAAA
> > AAAAAA
> >
> > and so on.
> >
> > My model has nothing to do with that "AAAAAA" messages. I have not saw such messages before.
> >
> > [...]
> >
> > Some time later I got this message:
> >
> > Error: Java ran out of memory. Running Java with a larger memory allocation
> > pool (heap) may fix this. But it won't help if some state has an enormous
> > number of successor states, or if TLC must compute the value of a huge set.
> > Finished in 04min 10s at (2017-04-04 18:08:25)
> >
> > This is strange because I ran TLC with the same parameters for virtual machine memory
> > as before.
>
>
> Hi Evgeniy,
>
> the "AAAAAA" messages are obviously bogus but anyway harmless. My guess
> is, that the corresponding print statement in the code is a leftover
> from a debugging session.
> The reason why TLC shows several messages is, because your liveness
> properties have been split into 21 branches and TLC recovers a graph for
> each one them.
>
> With regards to the out of memory error, are you saying that you only
> get the error if you recover from a checkpoint but don't get it, if you
> run the exact same model (same parameters, safety & liveness properties,
> ...) without recovery? If yes, can you make your checkpoint data and
> spec & model (privately) available to me?
>
> By the way, when did you download the Toolbox? The version TLC reports
> indicates that you are running a nightly/CI build.
>
> Cheers
> Markus
The whole story is as follows:
I downloaded TLA+ tools from some official web site (maybe Lamports website, I dont remember exactly)
I ran TLC on my model which produced aforementioned checkpoint.
Some time later I decided to stop TLC (ctrl+c). I did that after checkpoint was finished so there is no reason to suspect that checkpoint is broken.
I ran TLC again with that checkpoint (- but got an error which is a subject of this topic (NoSuchElementException)
Then I saw that message from mr.Giuliano, where he suggest to try latest build.
I downloaded that latest build and ran it again. I ran it with the same parameters as before except checkpoint interval parameter (It was increased this time).
I used this link for download: http://tla.msr-inria.inria.fr/tlatoolbox/ci/tladist/
My original run command was as follows:
tlc RAPlus.tla -config RAPlus.cfg -deadlock -metadir /run/media/MC -checkpoint 5
My recover attempt was:
tlc RAPlus.tla -config RAPlus.cfg -deadlock -metadir /run/media/MC -checkpoint 30 -recover /run/media/MC/17-04-03-13-54-53
tlc is a script which is equal to:
#!/bin/bash
java -Djava.net.preferIPv4Stack=true \
-Dcom.ibm.cacheLocalHost=true \
-Djava.net.preferIPv6Addresses=false \
-Xmx16g \
tlc2.TLC -cleanup -workers 3 $*
> With regards to the out of memory error, are you saying that you only
> get the error if you recover from a checkpoint but don't get it, if you
> run the exact same model (same parameters, safety & liveness properties,
> ...) without recovery?
I get this error on recovery using that latest version of TLC. I actually get memory error also when running latest version of TLC from scratch (after several hours of checking).
But I do not get that error when using previous version of TLC from scratch on that model.