getting started with tla+, are there bakery algo/election with leases examples I could follow?

To help me beginning learning how to translate algorithms into TLA+ for model checking,

I made up a little leased leader-election algorithm. I'll call it ALLCALL, and I'll write it out below.

ALLCALL supposes that there is a multicasting and point-to-point network available, and

that clock drift can be bounded.

My question is: how does one encode such an algorithm into TLA+ (or pluscal)? If there is a similar encoding (e.g. of the Bakery algorithm or similar) available somewhere, perhaps that could get me started.

Thanks for any and all pointers to examples/guides on how to translate this.

I tried reading the Raft TLA+ spec (https://github.com/ongardie/raft.tla/blob/master/raft.tla), but I was completely lost with what those i and j subscripts mean...

- J

# the ALLCALL leased-leader election algorithm

# Jason E. Aten, February 2017

Let heartBeat = 1 sec.

Let leaseTime = 10 sec.

Let maxClockSkew = 1 sec, where maxClockSkew

is a bound on how far out of sync our local

clocks may drift.

Given: Let each server have a numeric integer rank, that is distinct

and unique to that server. If necessary an extremely long

true random number is used to break ties between server ranks, so

that we may assert, with probability 1, that all ranks are distinct integers,

and that each server, absent an in-force lease, can be put into a strict total order.

Rule: The lower the rank is preferred for being the leader.

Ordering of lease then rank: we order the pair (leaseExpires, rank) first

by largest leaseExpires time, then by lower rank. If

both leases have expired, then neither is considered as a part of the ordering.

ALLCALL Algorithm phases:

I. Init phase: when a server joins the cluster,

it does not issue allcalls (a ping of all members)

until after leaseTime + maxClockSkew time has elapsed.

During init, the server does, however, accept and respond to allcalls() from

other cluster members. The allcall() ping will contain

the current (lease, leader-rank) and leader-id

according to the issuer of the allcall(). Every

recipient of the allcall updates her local information

of who she thinks the leader is, so long as the

received information is monotone in the (lease, leader-rank)

ordering; so a later unexpired lease will replace an

earlier unexpired lease, and if both are expired then

the lower rank will replace the larger rank as winner

of the current leader.

II. Ping phase: after a server has finished its Init phase, it

enters its ping phase.

During ping phase, the server continues to accept and respond

to allcall requests from other servers. Now in addition,

the server also issues its own allcall() pings every

heartbeat seconds.

Properties of the allcall():

The allcalls() are heard by all active cluster members, and

contain the sender's computed result of who the current leader is,

and replies answer back with the recipient's own rank and id. Each

recipient of an allcall() replies only to the sender.

So the sending of the allcall is a multicast (publish), but

the replying to an allcall is a point-to-point operation.

Conjecture: assuming that communication is maintained between

servers, then for the set of live servers at any point in

time, there can be only zero or one leader at a time, from

the point of view of any node in the cluster; and if there is

a leader (rather than not knowing), then there is

agreement between all "knowing" nodes on who that leader is.

This last conjecture is intended to be strong, on purpose, as I would like

to find a counter-example, if one exists.