[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Distributed TLC (was Re: [tlaplus] workers and cleanup options)

On 11/20/2012 08:18 PM, Ouiza wrote:
> I am running TLC in a context in which I have many clusters and each
> cluster is of many multiprocessor computers. I am using the workers
> option but only in one multiprocessor machine. Is there a way to do what
> I want to?, i.e run TLC in a cluster with many multiprocessor machines.

Hi Ouiza,

TLC - in recent version - comes with support for distribution, which
allows you to make use of the power of a cluster.

Unfortunately there isn't much documentation (yet), but it's actually
quite simple to get started:

If you run it from within the Toolbox, you have to check the "Run in
distributed mode" on the main model page under the "How to run" section
of your model. Afterwards, you either start the (remote) workers from
the command line or via Java web start. Instructions on how to start
workers, can be found on a web page served by your Toolbox [1].

If you prefer the command line, just copy+paste the "headless"
instructions and execute it on each worker machine. For Java webstart,
you simply open a browser on each worker and have it open page [1]. Java
Webstart will take care of the rest (on Windows it will even install a JRE).

Since you say you have a clusters of computers, you probably find the
command line somewhat easier. This is especially the case if you make
use of tools like clusterssh [2] or parallelssh [3] to launch many
workers from a single machine at once.

Some more verbose documentation is also part of the Toolbox help. Go to
the TLA+ Toolbox User Guide > Model Checking > Creating a Model > Model
Overview Page > Running TLC in distributed TLC.

Note that distributed TLC is still under active development and
enhancements as well as bug-fixes are made almost daily. If you run into
any problems, feel free to ping me.


[1] http://localhost:10996/ (replace "localhost" with the name of your host)
[2] http://sourceforge.net/projects/clusterssh/
[3] https://code.google.com/p/parallel-ssh/