[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [tlaplus] some help needed in running tlc in distributed mode



On 18.11.2015 09:32, jar...@xxxxxxxxx wrote:
> I am trying run TLC in distributed mode on AWS, and
> 
> - I have created a specification 'model' using Toolbox in directory 'tangled '
> - I have successfully run it locally as 'Model_1'
> - I have created two EC2 instances myInstance and myInstance2
> 
> I have read the instructions in https://tla.msr-inria.inria.fr/tlatoolbox/doc/model/distributed-mode.html#help, and tried the following
> 
> - copy specification && model && all related Toolbox stuff to master node myInstance: 
>     scp  tanged/ myInstance:/tmp -F ssh/config.aws 
> - copy tla2tool.jar to master node: 
>     scp -F ~/java/tla/tla2tools.jar myInstance:/tmp/
> 
> on master instance:
> - launch server as instructed 'distributed-mode.html#help':
>   cd /tmp && java   -cp tla2tools.jar tlc2.tool.distrid.TLCServer tangled/model.toolbox/Model_1/MC
> 
> Observation 1: 
> 
> - netstat -tulp shows that port 10997 is being listened, distributed-mode.html#help' talks about 10997
> 
> on slave instance (myInstance2)
> 
> -  access master machine port 10997 successfully with telnet 172.31.15.118 10997  
> 
> Observation 2: 
> - running wget http://172.31.15.118:10997/tla2tools.jar fails, and returns "HTTP request sent, awaiting response... No data received...Retrying" 
> 
> 
> Taking a peek at the code 'org.lamport.tla.toolbox.jcloud.CloudDistributedTLCJob' 
> 
> Observation 3: 
> - Toolbox creates environment e.g. 
>      mkdir /mnt/tlc/ && "
>      chmod 777 /mnt/tlc/ && "
>      ln -s /tmp/MC.out /var/www/html/MC.out
>      ln -s /tmp/MC.err /var/www/html/MC.err
>      ...
> 
> Questions:
> 
> - Are the instructions in https://tla.msr-inria.inria.fr/tlatoolbox/doc/model/distributed-mode.html#help valid?
> - If yes: what could be wrong with my setup?
> - If no: what instructions I should follow in order to be able to run TLC using tla2tools.jar from command line?
> 

Hi Jukka,

the instructions in distributed-mode.html are correct. Your problem is
that your master myInstance does not run the Toolbox. The Toolbox is the
one that binds to 10996 and serves tla2tools.jar. 10997, on the other
hand, is TLCServer's RMI port.

CloudDistributedTLCJob has no relevance in this context. Even though it
will eventually automate the steps above and deploy TLC in the cloud for
you, it is not yet able to handle multiple instances (contributions
welcome).

Cheers
Markus